Introduction

The outbreak of coronavirus disease 2019 (COVID-19) caused by the new coronavirus (SARS-CoV-2) infection has prompted worldwide attempts to develop efficient molecules to treat the disease and symptoms (Samantaray et al. 2021). However, developing novel molecules culminating in translation against infections can be laborious, time-consuming, and expensive (Paul et al. 2010). Thus, identifying the therapeutically effective entity against the disease from a pre-existent clinically approved repository of molecules may be advantageous (Ciliberto et al. 2020).

Virus infections such as SARS-CoV-2 reprogram the host cells to consume more glucose and upregulate metabolic activities such as glycolysis, akin to the Warburg effect and alter glycosylation to survive, replicate, and transmit infections (Mullen et al. 2021). Similar to glucose, internalization of 2-DG is facilitated by glucose transporters followed by its phosphorylation into inactive metabolite, 2-deoxyglucose-6-phosphate. This glucose deprivation in the cells leads to reduced proliferation and induction of apoptosis (Schmidt and O’Donnell 2021).

2-Deoxy-d-glucose (2-DG) has been tested to inhibit glycolysis and hence SARS-CoV-2 replication in monocytes and epithelial cells consequently leading to increased HIF-1α and reduced inflammatory mediators (Pliszka and Szablewski 2021; Medini et al. 2021; Codo et al. 2020). It was reported that 2-DG prevents viral replication by hindering virus DNA polymerase (Codo et al. 2020; Liu et al. 2021) attaching to specified receptors on the cell surface and obstructing viral invasion into the target cells; blocking viral protein synthesis, obstructing delayed phases of virus assembly (Codo et al. 2020). The metabolic processes such as glycolysis in the cytoplasm and glycosylation in the endoplasmic reticulum can be interrupted using glucose mimics such as 2-deoxy-D-glucose (2-DG) (Xi et al. 2014). By impeding viral replication, high energy requirements, and viral assembly, it could be a potential therapeutic candidate (Khurana et al. 2022).

The reports from the different phases of clinical trials have shown that 2-DG aids in improving the health status of severely Covid-19-infected individuals and decreases oxygen therapy dependency. It was found that a large number of 2-DG-treated patients reported negative within 5 days (Goel 2021; Wang et al. 2021). 2-DG as an anti-viral agent has previously been reported wherein the inhibition of replication of enveloped viruses such as herpes simplex virus (Courtney et al. 1973), measles virus, respiratory syncytial virus (Hodes et al. 1975), Semiliki forest virus, and Sindbis virus (Kaluza et al. 1972) are demonstrated. In an in vivo study, 2-DG inhibited rhinovirus load and inflammation in mice (Gualdoni et al. 2018). Several proteins such as non-structural protein 1 (Nsp1), RNA-dependent RNA polymerase, 3CLpro are the attractive targets involved in COVID-19 treatment (Singh et al. 2021, 2022). In silico analysis suggested efficient binding of 2-DG with SARS-CoV-2 viral main protease 3CLpro and NSP15 endoribonuclease (Balkrishna et al. 2020). As considerable knowledge on molecular interaction between 2-DG and SARS-CoV-2 and drug response is lacking, there is an absolute requisite to integrate the information from 2DG interacting genes by in silico analysis. The genes and their products are regulated by various mechanisms that involve correlation between many processes, metabolic pathways, and regulatory factors (Vohra et al. 2021). One prevalent form of gene variants is single nucleotide polymorphisms (SNPs), where two different bases appear at a remarkable rate in human diversity (Prabhu et al. 2021). The genetic profiling based on the identified and functionally characterized SNPs is considered a “fingerprint”, possibly used to determine the risk of disease susceptibility and drug response (Shastry 2007). Many variants residing in non-coding and non-regulatory sequences are functionally silent. However, few SNPs alter the structure and function of the protein. The role of functional SNPs, which can alter the regulation and structure of the protein in relation to the effects of 2-DG, is not well understood. These functional SNPs are considered an ideal substrate for the human population in health and illness (Alwi 2005).

Hence, the current study is aimed to investigate the influence of functional or regulatory SNPs on the potency and pernicious effect of 2-DG. Therefore, the main purpose of the research was to examine the impact of SNPs in the 2-DG interacting pathway genes by interrogating various bioinformatics resources and assessed the influence of SNPs on the protein stability, miRNA regulation, and cis-acting elements to evolve a relationship for pharmacogenomics purposes.

Materials and methods

Identification of interacting genes of 2-DG

The interacting genes of 2-DG were retrieved from the Comparative Toxicogenomic Database (CTD) (Grondin et al. 2021) using the parameter named chemical-gene interaction in Homo sapiens. UniProt database (Uniprot Consortium 2021) was used to retrieve the data of all the 2-DG interacting gene families, and further, these data were utilized for downstream analysis.

Pathway interaction among 2-DG interacting genes

The 2-DG interacting genes were subjected to the Cytoscape tool v3.0 Software ClueGO v2.5.8 (Bindea et al. 2009) was employed to identify the networks in the degree sorted circular layout to interpret the biological function of the selected genes. The distinct ontologies such as molecular function, pathways, and human diseases were used in the framework, and the GO terms were connected using kappa statistics based on the overlapping genes.

Retrieval and characterization of SNPs

For the selected genes, SNPs were retrieved by preferring the option variant table in the Ensembl genome browser (m.ensembl.org). The retrieved SNPs were further classified into missense variants, 5′-UTR variants, 3′-UTR variants, synonymous SNPs, intronic SNPs, splice donor, splice acceptor variants, splice region SNPs, stop retained SNPs, stop-loss SNPs, stop-gained SNPs, and non-coding transcript exon variants. Among these, missense SNPs were considered for further functional analysis.

In silico prediction of missense variants functional impacts

The selected missense variants were scrutinized utilizing six diverse tools with mutation score accessible in the Ensembl genome browser, and these included CADD (Combined Annotation-Dependent Depletion), Mutation assessor, SIFT (Sorting Intolerant from Tolerant), Revel (Rare exome variant ensemble learner), MetaLR, and PolyPhen-2 (Polymorphism Phenotyping). The SNPs characterized as “deleterious” in all the tools were carefully chosen and evaluated for their effect on protein structure and stability.

Protein modeling and mutation effect on protein stability

To interpret the effect of deleterious SNPs on protein structure, we predicted the native and mutant forms by protein modeling. The predicted model of the native form was available from the AlphaFold protein structure database (Jumper et al. 2021), and the mutant form of the protein structure was modeled using an automated protein structure homology-modeling server, SWISS-MODEL via Expasy webserver (Waterhouse et al. 2018), by considering the native predicted model as a template. The alteration in the hydrophilicity or hydrophobicity for the deleterious SNPs due to the amino acid change is presented using the hydropathy index (Kyte et al. 1982). The stability of the protein was determined based on point mutation using the CUPSAT mutation tool (Parthiban et al. 2006) of the 3D AlphaFold structure of variants retrieved from UniProt database. Using Swiss-PDB Viewer (Kaplan and Littlejohn 2001), the energy minimization using the steepest descent algorithm was performed for the mutated protein model with the corresponding amino acid substitution, compared with the native protein model, followed by total energy calculations. The root-mean-square deviation (RMSD) was calculated using align function from Pymol software to find the divergence in mutant form from the native form of the protein (Yuan et al. 2017).

Prediction of functional microRNA target SNPs

The identified 2-DG-associated genes were deployed to predict the SNPs in the microRNA binding sites that were functional using three databases. These were microRNA-related Single Nucleotide Polymorphisms v3 (miRNASNP3) (Gong et al. 2015), PolymiRTS database (Bhattacharya et al. 2014), and miRNA-related SNPs (MirSNP) database (Liu et al. 2012). The MirSNP database was utilized to investigate the miRNA binding SNP locations and their consequences on the target position. Furthermore, the PolymiRTS database was employed to obtain the variants and their concomitant miRNAs at wild and mutant alleles and assessed their effect on the target gain/loss in the 3′-UTR using the miRNASNP3 database.

SNPs at the transcription factor binding site (TFBS)

The shortlisted 2-DG interacting genes were utilized to obtain the SNPs in TFBS employing SNP2TFBS (Kumar et al. 2017). The parameter named annotated variants were employed to obtain the SNPs residing in the upstream and 5′-UTR regions. The SNPInspector in Genomatix Software Suite (https://www.genomatix.de/) was applied to predict if SNPs in TFBS generate or destroy the TF binding sites.

Enhancers SNPs

The identified 2-DG-associated genes were further utilized to analyze the influence of SNPs residing in enhancers using EnhancerDB (Kang et al. 2019) and ENCODE laboratories software HaploReg version 4.1 (Ward and Kellis 2012). The search preference comprising gene was utilized in the EnhancerDB database to retrieve the enhancer SNPs of the shortlisted genes. Further, HaploReg v4.1 was used to evaluate the regulatory motifs of the enhancer SNPs that were altered.

Results

Identification of interacting genes for 2-DG

We identified 48 interacting genes for 2-DG (Table 1) and plotted their position using the Circos ideogram. The depiction indicated the distribution of genes over 21 autosomes and X chromosome except for 13 autosome and Y chromosome (Fig. S1). The overview of plot shows 48 genes (from outer ring inwards), 5′-UTR SNPs, intronic SNPs, 3′-UTR SNPs, synonymous SNPs, missense variants, splice variants (splice region, splice donor, splice acceptor), start lost, stop-lost, stop-gained, stop-lost SNPs, inner most ring constitutes non-coding transcript exon variant and NMD transcript variant. The schematic illustration of in silico workplan is shown in Fig. 1.

Table 1 Details of selected 2DG interacting genes for downstream analysis
Fig. 1
figure 1

Schematic representation of in silico workflow of the study

Pathway interaction among 2-DG interacting genes

The interaction among 2-DG genes constituted a network after employing the statistical option Enrichment/Depletion test (two-sided hypergeometric test) (Fig. S2). The resulting network indicated 13 Kappa score groups such as apoptotic factor-mediated response, the intrinsic pathway for apoptosis, cytochrome C-mediated apoptotic response, interleukin-4, and interleukin-13 signaling, integration of energy metabolism, macroautophagy, purinergic signaling in leishmaniasis infection, ATF6 alpha activates chaperone genes, mTOR signaling, FOXO-mediated transcription, protease binding, collagen-binding and SARS-CoV infections (Fig. S2). It was found that 45.87% of the associated genes (CASP3, CASP9, MAPK1, MAPK3, XIAP) contributed to cytochrome C-mediated apoptotic response and 3.67% of the associated genes (BECN1, FXYD2, GSK3B, HSP90AA1, ITGB1, MAP1LC3B) contributed to SARS-CoV infections (Fig. S2).

SNPs characterization

A sum of 9,66,482 SNPs was obtained by using the Ensembl genome browser (m.ensembl.org) from human genome assembly GRCh38.p13 (1000 Genomes Project). The retrieved variants were mined which generated 1,04,034 SNPs. These shortlisted variants were further classified depending on their function. These SNPs were from 5′-UTR (295), intronic regions (27,917), 3′-UTR (1729), synonymous SNPs (519), splice variants of the genes including splice donor, acceptor, splice region (119), non-coding transcript exons (103), 8 stop-gained, stop-lost SNP (1), NMD-transcript variants (20), and 616 were missense variants (Fig. 2).

Fig. 2
figure 2

Schematic representation of in silico SNP search and characterization

Selection of lethal nsSNPs

Among 616 missense SNPs, SIFT analysis predicted 248 SNPs (40.25%) as “deleterious”, however, the prediction rate of mutation by PolyPhen-2 was 149 (24.18%) as “probably damaging”. CADD, Revel, Meta LR, and Mutation Assessor reported 27 SNPs (4.38%), 109 SNPs (17.69%), 116 SNPs (18.83%), and 464 SNPs (6.49%) as likely deleterious, likely disease-causing, damaging, and high, respectively (Fig. S3). A total of six diverse bioinformatic resources, such as CADD, Mutation assessor, SIFT, Revel MetaLR, and PolyPhen-2 collectively showed three lethal missense variants (Fig. 3); A2M rs201769751, rs778604418, and PARP1 rs193238922 (Table S1).

Fig. 3
figure 3

Pathogenicity predictions showing common deleterious non-synonymous SNPs

Protein modeling and mutation effect on protein stability

Out of three deleterious SNPs identified, A2M (rs778604418) and PARP1 (rs193238922) showed a change in hydrophobicity or hydrophilicity, but none of them showed a change in its polarity. The change in polarity and hydrophobicity may affect the protein structure and its activity. The divergence in free energy of unfolding between native form and mutant form of proteins known as ΔΔG is calculated by CUPSAT tool using structural environment-specific atom capability and torsion angle capability. Henceforth, the stability of the protein was identified in terms of predicted ΔΔG values (kcal/mol). Out of three deleterious SNPs, A2M (rs778604418) showed more stability with a predicted ΔΔG value of 3.35 kcal/mol and A2M (rs201769751), PARP1 (rs193238922) affects the protein stability with predicted ΔΔG value of − 5.07 kcal/mol and − 0.51 kcal/mol, respectively (Table S2). The native form of the protein A2M (AlphaFold ID: AF-P01023-F1), PARP1 (AlphaFold ID: AF-P09874-F1) was retrieved from the AlphaFold database, and the mutant form was modeled and validated using the Ramachandran plot. The mutant model showed that 95% of the amino acids were present in the favorable region and considered for further in silico analysis. The native and mutant protein forms of deleterious SNPs along with overlapping models were shown (Fig. 4). A high QMEAN score and sequence identity from the swiss model was considered for the superimposition of the mutant model over the native structure and visualized using Swiss-PDB Viewer. The total energy of mutant structures in all three polymorphisms was less compared to native protein structures. Hence, it is believed that these three deleterious SNPs may affect the protein structure and function. Further, the calculated RMSD value for A2M (rs201769751, rs778604418) and PARP1 (rs193238922) were 0.052 Å, 0.047 Å, and 0.221 Å, respectively. It is reported that the higher the RMSD value, the greater the deviation between the native and mutant forms of the protein structures, which in turn indicates the change in its functional activity. The total energy and RMSD value of native and mutant forms of all the polymorphisms are tabulated in Table S3.

Fig. 4
figure 4

Native, mutant and superimposition of native and mutant modeled structures of the A2M (1) rs201769751 (2) rs778604418 (3) PARP1 rs193238922. a Structure of native protein, b enlarged structure of native protein, c structure of mutant protein, d enlarged structure of mutant protein, e superimposed model of native and mutant protein structures, f enlarged superimposed model of native and mutant protein structures

Prediction of functional microRNA target SNPs

The functional microRNA targeting SNPs were predicted using three different resources, and these were miRNASNP3, PolymiRTS, and MirSNP, which concomitantly reported 12 SNPs (rs11552192 in the BECN1, rs60393216 in the GSK3B, rs9903 in the MAP1LC3B, rs1065154, and rs10277 in the SQSTM1, rs10415095 in the STK11, rs28382747, rs28382755, rs28382752, rs28382740, rs28382742, rs17330644 in the XIAP) with the minor allele frequency (MAF) of 10% in the microRNA binding sites. It also indicates any miRNAs linked with SNPs residing in the target position would create or destroy a miRNA-mRNA binding site (Table 2).

Table 2 Details of miRNA target site SNPs with minor allele frequency > 0.1

SNPs at the transcription factor binding site (TFBS)

A sum of 22 SNPs was found to be in TFBS with MAF > 0.1 by SNP2TFBS; among them, 17 and 5 SNPs reside in the upstream and 5′-UTR region, respectively. Further, SNPInspector projected that rs712829 in the EGFR generates 15 TFBS; rs60221525 in the BECN1, rs4645980 in the CASP9, and rs5393 in the SLC2A2 impaired binding position for 14 transcription factors (TFs). The effect of 22 SNPs at TFBS revealed those SNPs that would generate or disrupt the positions for the binding of TFs (Table 3).

Table 3 Impact of SNPs in the transcription factor binding site with MAF > 0.1

SNPs in enhancers

The two databases, namely, EnhancerDB and HaploReg were employed to identify SNPs in the enhancers which unanimously identified 42 SNPs residing in the introns and 1 3′-UTR SNP with MAF > 0.1. Out of 43 SNPs, rs3795063 in the STK11 gene showed 19 regulatory motifs that were altered which included CAC-binding-protein, CACD, E2A, Egr-1, Irf, Klf4, Klf7, Myc, Myf, NRSF, Pou2f2, Rad21, SMC3, SP1, SP4, TATA, UF1H3BETA, YY1, and Zfp740. The rs10861203 in the HSP90B1 gene reported 14 regulatory motifs that were altered and these included BCL, BDP1, ELF1, Egr-1, Ets, FEV, Myc, NERF1a, Nrf-2, Pax-5, STAT, TBX5, Tel2, and p300. The specifics of SNPs residing in the enhancers and their altered regulatory motifs are catalogued (Table 4).

Table 4 SNPs in enhancers and their altered regulatory motifs with MAF > 0.1

Discussion

Detection of therapeutically effective entity counter to the disease from a pre-existent molecule repository may substantially reduce the time and efforts against new drug discovery and clinical trial randomization. The approach of repurposing the existing drugs has resulted in the detection of a large number of effective molecules for the treatment of COVID-19 infection (Ciliberto et al. 2020).

In order to simplify the overview of large number of 2-DG interacting genes that has been extracted, massive number of SNPs residing in respective genes were mined and characterized based on their location. The distribution of these SNPs was depicted by circos which is an unambiguous representation to lessen the inherent complexities and consider the density and dynamic range within huge data sets (Krzywinski et al. 2009). Further, our in silico approach has detected 80 genetic variants associated with 2-DG interacting genes using diverse bioinformatics resources. Therefore, an assessment of these variants was performed by employing various SNP prediction resources and by choosing the overlapping SNPs to overcome the false-positive findings. The pathway analysis aids in investigating interrelationships of terms and functional groups that constitute biological networks (Bindea et al. 2009). The pathway analysis of 2-DG interacting genes emphasized various processes: cytochrome C-mediated apoptotic response, interleukin-4, and interleukin-13 signaling, among others. Interestingly, the assessment also indicated susceptible gene loci for SARS-CoV infections. The pathway assessment among 2-DG interacting genes also highlighted apoptosis-related signaling mediated by the caspase family of proteins which may modify the metabolism of cells and enhance the rate of cell death (Gioti et al. 2021) and its potential role in viral infection inhibition (Plassmeyer et al. 2021). Cell death due to 2-DG in various tumor cells has been reported and could be mediated by ER stress/autophagy in HCT116 colon cancer cells or through Cytochrome C-Caspase 3-PARP axis in certain other cells (Maximchik et al. 2018). Similarly, A2M which is a key anti-inflammatory protease can induce cell proliferation when ligated to chaperon GRP78 by increasing the glucose uptake (Vandooren and Itoh 2021). GRP78 also accumulates upon ER stress induced by 2-DG thus sequentially increasing its uptake when provided in place of glucose (Kim et al. 2018). Thus, any structural alterations in A2M may determine the efficacy of 2-DG treatment.

Often 3′-UTR and less frequently exon bound miRNAs silence and regulate the genes at a posttranscriptional level. The variations due to SNPs introduced into the miRNA binding regions may diminish binding affinity and consequently affect its function (Prabhu et al. 2021). We extracted the SNP information of 2-DG interacting genes to unravel the miRNA binding sites employing three databases namely miRNASNP3, MirSNP, and PolymiRTS and examined whether or not miRNAs linked polymorphisms residing in the target region would generate or disrupt a miRNA-mRNA binding region. The findings of our study showed the impact of two miRNA target SNPs (rs1065154, rs10277) residing in the SQSTM1 gene which could create or break at ancestral and mutant allele. Expression quantitative trait loci analysis is a robust technique toward determining genetic loci linked with quantitative variations in gene expression. After employing Genome-Wide Association analysis to the set of records containing approximately 3,00,000 SNPs and 48,000 mRNA expression traits from high throughput technique, researchers found 1226 significant associations, out of which 95 associations were linked to ADME of drugs. The variant rs10277 residing in the gene SQSTM1 in human liver samples reported that allele C is linked with increased transcription compared to allele T. These data broaden our understanding regarding the genetic features of inter-individual variation in gene expression in conjunction with specific prominence on pharmacogenomics (Table S4) (Schröder et al. 2011; Whirl-Carrillo et al. 2012).

In this study, the influence of polymorphisms in TFBS and enhancers were also analyzed. The massive number of genetic variants detected from GWAS resides in the genome’s noncoding region and are of significant interest when located in regulatory sites such as promoters and enhancers as these variants may influence gene expression and these may play a major role in the complex traits that elicits drug response. Thus, we screened the 5′-UTR and upstream SNPs of the selected genes to verify whether the substitution of SNP allele and modified TF binding sites would possibly perturb gene regulation (Buroker 2017).

Pathogenic and other exposures cause leucocytes to respond quickly, with effects ranging from cytokine generation to migration and engulfing by phagocytosis (Marsin et al. 2002; Yang et al. 2012; Wahl et al. 2012). Activation of mononuclear cells with lipopolysaccharide enhanced the production of cytokines IL-1B, IL-6, and TNF-alpha, as predicted (Fangradt et al. 2012; Freemerman et al. 2014). Accelerated glycolytic flow produces ATP quickly to meet these critical processes, which are bioenergetically expensive (Palsson-Mcdermott and O’Neill 2013; Macintyre and Rathmell 2013). For all three cytokines, the competitive glycolysis inhibitor 2-DG dramatically inhibited lipopolysaccharide-mediated generation of cytokines (Jones et al. 2015). Our findings reported rs1143627 residing in IL1B generated two TFBS for PTBP, MYT1. One of the studies proclaimed that rs1143627 residing in the gene IL1B was found to be associated with Influenza A susceptibility in humans. The findings also showed that aged adults or individuals of any age with comorbid or immunosuppressive conditions might be at a greater risk of disease development. IL1B rs1143627 was also considered to be susceptibility alleles in individuals suffering from liver fibrosis infected by the hepatitis B virus (Wu et al. 2018). Extensive data reported the role of two variants, namely, rs712829 residing in EGFR gene and rs1143627 in IL1B gene in NCI-60 cancer cell lines and human samples, highlighting the effect of genotype on neoplasms and psoriasis on the usage of diverse drugs molecules (Tables S4, S5) (Whirl-Carrillo et al. 2012). Additionally, enhancers that regulate gene expression function as rheostats for transcription, which will further tune up the levels of specific transcripts (Corradin and Scacheri 2014). Henceforth, in the current study 43 SNPs have shown a wide spectrum of altered motifs that may result in gene regulation.

Due to the complexity of the infection, an apt determinative model and efficacious medication for COVID-19 infection are yet to be evolved. As the innate immune system is inadequate to produce a powerful immune response counter to the virus, multi-targeted factors that mitigate viral infection, replication, and host immune reactions are warranted. In the present study, a sum of three polymorphisms (SQSTM1 rs10277, IL1B rs1143627, EGFR rs712829) of 2-DG interacting genes may increase the susceptibility to SARS-CoV infections than other polymorphisms. However, these identified polymorphisms need to be considered by experimental validation of the likelihoods proposed in the current work is required in larger cohorts for repurposing the drug. Further, this in silico study was conducted to shed light on the pharmacogenomic concerns of 2-DG against SARS-CoV-2. We believe that the selected variants in the current study should be wisely considered to overcome adverse drug reaction and to strengthen the foundation for future medical exploration. Nevertheless, it is universally believed that an SNP acts through neighboring genes when it is most likely connected to a phenotype or illness. Therefore, it is undeniable that the present strategy may overlook certain associated genes.

Conclusions

In the current in silico study, efforts were made to identify the genetic biomarkers of 2-DG interacting genes, which may determine the risk of gene polymorphisms and drug response. The in silico data mining strategy aids predominantly in finding the drug interacting genes, and their respective pathways and supports in assessing the influence of SNPs in distinct genic regions. Eventually, the information creates an integrated foundation to delineate the intricate molecular relationships among 2-DG interacting genes and may subsequently provide insight to predict COVID-19 infection risk and treatment strategies with 2-DG.