Bioinformatics screening of colorectal-cancer causing molecular signatures through gene expression profiles to discover therapeutic targets and candidate agents

Horaira, Md Abu; Islam, Md. Ariful; Kibria, Md. Kaderi; Alam, Md. Jahangir; Kabir, Syed Rashel; Mollah, Md. Nurul Haque

doi:10.1186/s12920-023-01488-w

Bioinformatics screening of colorectal-cancer causing molecular signatures through gene expression profiles to discover therapeutic targets and candidate agents

Research
Open access
Published: 29 March 2023

Volume 16, article number 64, (2023)
Cite this article

Download PDF

You have full access to this open access article

BMC Medical Genomics Aims and scope Submit manuscript

Bioinformatics screening of colorectal-cancer causing molecular signatures through gene expression profiles to discover therapeutic targets and candidate agents

Download PDF

Md Abu Horaira¹,
Md. Ariful Islam¹,
Md. Kaderi Kibria¹,
Md. Jahangir Alam¹,
Syed Rashel Kabir² &
…
Md. Nurul Haque Mollah¹

4831 Accesses
6 Citations
1 Altmetric
Explore all metrics

Abstract

Background

Detection of appropriate receptor proteins and drug agents are equally important in the case of drug discovery and development for any disease. In this study, an attempt was made to explore colorectal cancer (CRC) causing molecular signatures as receptors and drug agents as inhibitors by using integrated statistics and bioinformatics approaches.

Methods

To identify the important genes that are involved in the initiation and progression of CRC, four microarray datasets (GSE9348, GSE110224, GSE23878, and GSE35279) and an RNA_Seq profiles (GSE50760) were downloaded from the Gene Expression Omnibus database. The datasets were analyzed by a statistical r-package of LIMMA to identify common differentially expressed genes (cDEGs). The key genes (KGs) of cDEGs were detected by using the five topological measures in the protein–protein interaction network analysis. Then we performed in-silico validation for CRC-causing KGs by using different web-tools and independent databases. We also disclosed the transcriptional and post-transcriptional regulatory factors of KGs by interaction network analysis of KGs with transcription factors (TFs) and micro-RNAs. Finally, we suggested our proposed KGs-guided computationally more effective candidate drug molecules compared to other published drugs by cross-validation with the state-of-the-art alternatives of top-ranked independent receptor proteins.

Results

We identified 50 common differentially expressed genes (cDEGs) from five gene expression profile datasets, where 31 cDEGs were downregulated, and the rest 19 were up-regulated. Then we identified 11 cDEGs (CXCL8, CEMIP, MMP7, CA4, ADH1C, GUCA2A, GUCA2B, ZG16, CLCA4, MS4A12 and CLDN1) as the KGs. Different pertinent bioinformatic analyses (box plot, survival probability curves, DNA methylation, correlation with immune infiltration levels, diseases-KGs interaction, GO and KEGG pathways) based on independent databases directly or indirectly showed that these KGs are significantly associated with CRC progression. We also detected four TFs proteins (FOXC1, YY1, GATA2 and NFKB) and eight microRNAs (hsa-mir-16-5p, hsa-mir-195-5p, hsa-mir-203a-3p, hsa-mir-34a-5p, hsa-mir-107, hsa-mir-27a-3p, hsa-mir-429, and hsa-mir-335-5p) as the key transcriptional and post-transcriptional regulators of KGs. Finally, our proposed 15 molecular signatures including 11 KGs and 4 key TFs-proteins guided 9 small molecules (Cyclosporin A, Manzamine A, Cardidigin, Staurosporine, Benzo[A]Pyrene, Sitosterol, Nocardiopsis Sp, Troglitazone, and Riccardin D) were recommended as the top-ranked candidate therapeutic agents for the treatment against CRC.

Conclusion

The findings of this study recommended that our proposed target proteins and agents might be considered as the potential diagnostic, prognostic and therapeutic signatures for CRC.

View this article's peer review reports

Identification of candidate biomarkers and therapeutic drugs of colorectal cancer by integrated bioinformatics analysis

Article 19 October 2020

Identification and verification of HCAR3 and INSL5 as new potential therapeutic targets of colorectal cancer

Article Open access 21 August 2021

Gamma-Aminobutyric Acid Type A Receptor Subunit Pi is a potential chemoresistance regulator in colorectal cancer

Article 25 January 2023

Introduction

Colorectal cancer (CRC) is an uncontrolled cell growth in the colon, rectum or appendix. It is the second most commonly diagnosed cancer in females and the third in males. The world health organization (WHO) reported in 2018 that over 1.8 million new cases and nearly 862,000 deaths due to CRC worldwide [1, 2]. With more than 2.2 million new cases and 1.1 million fatalities, the global incidence of CRC is projected to be increased 60% by 2030 [3]. The early stages of CRC symptoms are uncharacteristic and frequently ignored or misdiagnosed. Importantly, CRC is diagnosed at the middle or late stages of the disease. It is characteristically identified at the middle or late stages of the disease. The fecal-based examination, enteroscopy and blood-based examination are commonly considered the early detection methods for CRC [4]. However, several instrument-dependent detection methods are time-consuming, laborious and expensive. The leading treatment options for CRC are surgery, adjuvant chemotherapy (for colon cancer), neoadjuvant radiotherapy (for rectal cancer), and molecular drugs [5, 6]. However, these types of treatments have several drawbacks. According to the previous studies, less than 15% of metastatic CRC is suitable for surgery, the spreading rate of CRC exceeds more than 80% within 3 years after surgery, and the spreading rate exceeds more than 95% within 5 years after surgery [7]. Although there are some advancement in the case of CRC treatments, the 5-year survival time of patients with this disease has not yet increased significantly [6]. Therefore, the identification of new molecular biomarkers is essential for CRC diagnosis, prognosis and new therapies.

However, new drug discovery is a tremendously challenging, time-consuming and expensive task. The main challenges are to explore drug target proteins (receptors) responsible for the diseases and drug agents (small molecules) that can reduce the diseases by the interaction with the target proteins. Genomic biomarkers induced proteins are considered as the key receptors. Transcriptomics analysis is a widely used popular approach to explore genomic biomarkers [8,9,10,11,12,13]. The repurposing of existing drugs for certain diseases could reduce the time and cost compared to de novo drug development. By this time, several authors have suggested different sets of key genes (KGs) to explore molecular mechanisms and pathogenetic processes of CRC progression [14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45] in which some studies have employed multiple datasets to identify CRC-causing KGs [15,16,17, 22, 25, 26, 31, 32, 37, 40,41,42,43, 46,47,48,49]. Few studies also explored their suggested KGs-guided candidate drug molecules for the treatment against CRC [14, 37, 40,41,42, 50,51,52,53,54,55,56,57,58,59]. However, their published data did not display any common KGs as well as common drug molecules (see Additional file 1: Table S1) in all studies. None of those studies investigated the resistance performance of their suggested KGs-guided drug molecules against the CRC-causing independent KGs suggested by others. We found CRC-causing 170 different KGs and associated 64 different drug molecules in those articles. The articles those suggested therapeutics agents applied enrichment approach on Cmap, geneXpharma or DGIdb databases to select the KGs-guided candidate agents for the treatment against CRC [14, 37, 40,41,42, 50,51,52,53,54,55,56,57,58,59]. They did not provide pairwise drug-target binding affinity scores, since enrichment techniques cannot calculate pairwise binding scores. So, it may be difficult to select most potential drug-target pairs from the existing studies for experimental validation by the wet-lab researchers. On the other hand, though the total number of KGs 170 is much smaller than the whole genome size, it may be yet much laborious, time consuming and costly for the experimental validation of more than 170 × 64 = 10,880 drug-target pairs by the wet-lab researchers. Therefore, in this study, our main objectives were to explore (1) more probable CRC-causing KGs from multiple gene expression profile datasets through the verification with different benchmark datasets and independent databases and (2) proposed KGs-guided candidate drug molecules for the treatment of CRC through the verification of their resistance power against the CRC-causing top-ranked independent KGs suggested by others, by molecular docking analysis.

Materials and methods

The overview of this study including materials and methods is summarized in Fig. 1.

Data sources and descriptions

We collected gene expression profiles generated from CRC patients for exploring drug targets and small molecules (drug agents) for exploring candidate drugs by molecular docking simulation as described below.

Collection of gene expression profiles for exploring drug-target proteins (receptors)

Four human CRC microarray datasets (GSE9348, GSE110224, GSE23878, and GSE35279) and one RNA-Seq dataset (GSE50760) were downloaded from National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). The platform of GSE9348, GSE110224, and GSE23878, were GPL570 [HG-U133_Plus_2] (Affymetrix Human Genome U133 Plus 2.0 Array), GSE35279 was performed by GPL6480 (Agilent-014850 Whole Human Genome Microarray 4 × 44 K G4112F) and GSE50760 was performed by GPL11154 Illumina HiSeq 2000 (Homo sapiens). The summary of this dataset is given in Table 1.

Table 1 Details of gene expression profiles that we analyzed

Full size table

Collection of meta-drug agents for exploring candidate drugs

We collected meta-drug agents from the online database DSigDB [60] with respect to the proposed receptors and FDA approved repurposed drugs for the treatment of CRC patients.

Collection of independent meta-receptors for cross-validation with the proposed drugs

To select the top-ranked receptor proteins (meta-receptors) associated with CRC, we reviewed 33 recently published articles and selected the top-ranked 8 target proteins as the meta-receptors (see Additional file 1: Table S1).

Integrated statistics and bioinformatics approaches

To reach the goal of this study, we applied both statistical and bioinformatics approaches, as discussed below in detail.

Identification of DEGs by using LIMMA

To identify differentially expressed (DEGs) between tumor and normal conditions, we considered the linear models for microarray (LIMMA) data analysis suggested by Smith [61], which can be written as

$$z_{g} = Y\theta_{g} + u_{g}$$

(1)

where $z_{g} = \left( {z_{g1} ,z_{g2} , \ldots ,z_{gn} } \right)^{/}$ is the vector of expressions (responses) for gth gene with n = n₁ + n₂ samples (g = 1, 2, …, m), Y is an n × 2 design matrix, $\theta_{g} = \left( {\theta_{g1} ,\theta_{g2} } \right)^{/}$ is 2 × 1 vector (2 < n) of effects for two different groups of n samples, and the error vector $u_{g} \sim N(0, W_{g} \sigma_{g}^{2}$). Here $W_{g}$ is a positive definite weight matrix. We want to test the null hypothesis (H₀): $\theta_{g1} = \theta_{g2} = > \gamma_{g} = (\theta_{g1} - \theta_{g2} ) = 0$ (that is, gth gene is equally expressed gene (EEG) in both case and control groups) against the alternative hypothesis (H₁): $\theta_{g1} \ne \theta_{g2} = > \gamma_{g} \ne 0$ (that is, the gth gene is a DEG between case and control groups). To test H₀ against H₁, the moderated t-statistic was formulated by hybridizing the classical and Bayesian approaches in which the posterior variance is substituted into the classical t-statistic in place of the classical sample variance. The moderated t-statistic was defined as

$$\tilde{t}_{g} = \frac{{\hat{z}_{g} - z_{g} }}{{\tilde{s}_{g} \sqrt {\delta_{g} } }},$$

(2)

which follows t-distribution with $d_{g} + d_{0}$ degrees of freedom under H₀.

Adjusted P values based on the moderated t-statistics and the average of log fold-change (aLog2FC) values of the treatment group with respect to the control group were used to select DEGs as follows

$${\text{DEG}}_{{\text{g}}} = \left\{ {\begin{array}{*{20}l} {DEG \left( {{\text{Upregulated}}} \right),} \hfill & {if\quad adj.p.value\left\langle {0.01\,\, and\,\, aLogFC} \right\rangle 1.0} \hfill \\ {DEG \left( {{\text{Downregulated}}} \right),} \hfill & {if\quad adj.p.value < 0.01\,\, and\,\, aLogFC < - 1.0} \hfill \\ \end{array} } \right.$$

(3)

where

$${\text{aLog}}2{\text{FC}} = \left\{ {\begin{array}{*{20}l} {\frac{1}{{n_{1} }}\mathop \sum \limits_{i}^{{n_{1} }} \log 2(z_{{g_{i} }}^{T} ) - \frac{1}{{n_{2} }}\mathop \sum \limits_{j}^{{n_{2} }} \log 2(z_{{g_{j} }}^{C} ),} \hfill & { if\quad n_{1} \ne n_{2} } \hfill \\ {\frac{1}{n}\mathop \sum \limits_{i}^{n} \log 2\left( {\frac{{z_{{g_{i} }}^{T} }}{{z_{{g_{j} }}^{C} }}} \right), } \hfill & {if\quad n_{1} = n_{2} = n} \hfill \\ \end{array} } \right.$$

Here $z_{{g_{i} }}^{T}$ and $z_{{g_{j} }}^{C}$ are the expressions for the gth gene with the ith treatment and jth control samples, respectively. We implemented this algorithm using LIMMA r-package to calculate the P values [62] and aLogFC values to select the DEGs significantly from four gene expression datasets as introduced previously. We separated upregulated and downregulated DEGs for each of four datasets. Then we selected common upregulated and downregulated DEGs for all of four datasets. Then we combine common upregulated DEGs and common downregulated DEGs to construct the common DEGs (cDEGs) set.

Construction of PPI network to identify CRC-causing key genes (KGs)

Protein–protein interaction (PPI) network was constructed to identify common key-genes (KGs). The online STRING-v11 [63] database was used to construct the PPI network of cDEGs. The String database provides critical assessment and integration of protein interactions, including direct (physical) and indirect (functional) associations. To construct a PPI network, the distance ‘D’ between pair of proteins (u,v) is calculated as

$$D\left( {u,v} \right) = \frac{{2\left| {N_{u} \cap N_{v} } \right|}}{{|N_{u} \left| { + |N_{v} } \right|}}$$

(4)

where N_u is the neighbor set of u and N_v is the neighbor set of v. Cytoscape plugin cytoHubba was used to rank the nodes of the PPI network, which could be utilized to identify KGs in the network [64, 65]. In the present study, five topological methods, including Degree [66], BottleNeck [67], Betweenness [68], and Stress [69] was utilized to identify KGs.

In-silico validation of CRC-causing KGs

An attempt was made to validate the CRC-causing KGs by using different web-tools and independent databases as introduced below.

Expression analysis for KGs by GEPIA web-tool with TCGA RNA-seq data

To validate the expression levels of key genes, a gene expression profiling interactive analysis (GEPIA) tool (http://gepia.cancer-pku.cn/) was used to explore the related data in TCGA databases, and to analyse the expression levels of key genes in CRC tissues compared with normal tissues [70].

Association of KGs with the immune infiltration levels in different cancers including CRC

Tumor Immune Estimation Resource (TIMER) is an integrative resource for investigating the molecular characterization of tumor-immune interactions across various cancer types (https://cistrome.shinyapps.io/timer/) [71]. TIMER utilizes a deconvolution statistical method to deduce the abundance of six tumor-infiltrating immune cells, including B cells, CD4⁺ T cells, CD8⁺ T cells, macrophages, neutrophils and DCs from The Cancer Genome Atlas (TCGA).

DNA methylation of KGs

MethSurv is used to explore methylation biomarkers associated with the survival of various human cancers [72]. MethSurv is freely available at https://biit.cs.ut.ee/methsurv. Through the MethSurv website, we will analyze the DNA methylation analysis of the selected CRC-related genes in the TCGA database.

Association of KGs with different disease

The Disease-KGs enrichment analysis was performed using the Enrichr web tool [73] with DisGeNET database [74] to explore other disease risk factors for CRC patients.

Prognostic power analysis of KGs

To investigate the prognostic power of KGs, we performed cluster analysis, survival analysis and developed two prediction models using random forest (RF) and AdaBoost classifiers. The survival curve and ROC curve were used to assess the prognosis performance. The online SurvExpress computational tool [75] was used to produce a survival curve. The r-packages ‘gplots’ and ‘ROCR’ were used to produce heatmap and ROC curve, respectively. Exploring drugs by molecular docking simulation.

Exploring GO and KEGG pathway terms that are associated with DEGs including KGs

The GO (Gene Ontology) functions [76] and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis [77] were performed to explore CRC-causing ontology terms (Biological Process (BP), Cellular Component (CC), and Molecular Function (MF)) and pathways that are associated with cDEGs including KGs. To explore the significantly enriched GO terms and KEGG pathways by cDEGs including KGs, let S_i is the annotated gene-set corresponding to the ith type of biological functions or pathways given in the database, and M_i is the number of genes in S_i (i = 1, 2,…,r); N is the total number of annotated genes those construct the entire combine set $S = \mathop \cup \limits_{i = 1}^{r} S_{{\text{i}}} = S_{i} \cup S_{i}^{c}$ such that $N \le \mathop \sum \limits_{i = 1}^{r} M_{i} ;$ where $S_{i}^{c}$ is the complement set of S_i. Again, let n is the total number of cDEGs of interest and k_i is the number of cDEGs belonging to the annotated gene-set S_i. This problem is summarized by the following contingency table (Table 2):

Table 2 Contingency table

Full size table

To find the significantly enriched GO terms and KEGG pathways by our proposed cDEGs, the P value was calculated by the Fisher exact test statistic based on the hypergeometric distribution. We used Enrichr online tool to perform Fisher exact test [78].

Regulatory network analysis of KGs

To identify key transcription factors (TFs) as the transcriptional regulators of KGs, the TFs-KGs interaction network was constructed using the publicly available database JASPAR [79]. The interaction network was generated using NetworkAnalyst [80]. To identify key microRNAs (miRNAs) as the post-transcriptional regulators of KGs, the KGs-miRNAs interaction network was constructed by using the publicly available online tool TarBase v8.0 (Release 7.0) [81]. The top degree miRNAs were selected from the networks (miRNAs-KGs) and considered as key miRNAs.

Molecular docking simulation for exploring candidate drug agents

To explore efficient FDA approved repurposed drugs for the treatment of CRC patients, we employed molecular docking simulation between the target receptor proteins and drug agents. We considered our proposed KGs based hub-proteins and associated TFs proteins as the drug target receptor proteins and meta-drug agents collected from online databases and published articles for docking analysis. The molecular docking simulation requires 3-Dimensional (3D) structures of both receptor proteins and candidate drugs. We downloaded the 3D structure of all targeted receptor proteins from Protein Data Bank (PDB) [82] and SWISS-MODEL [66]. The 3D structures of drug agents were downloaded from the PubChem database [83]. The 3D structure of the target proteins was visualized using Discovery Studio Visualizer 2019 [84], and the water molecules, co-crystal ligands which were bound to the protein were removed. Further, the protein was prepared using Swiss-PdbViewer [85] and AutoDock Vina [86] in PyRx open-source software by adding charges and minimizing the energy of the protein and subsequently converting it to pdbqt format [86, 87]. The exhaustiveness parameter was set to 8. The Discovery Studio Visualizer 2019 was used to analyze the docked complexes for surface complexes, types and distances of non-covalent bonds. Let A_ij denotes the binding affinity between ith target protein (i = 1, 2, …, m) and jth drug agent (j = 1, 2, …, n). Then target proteins are ordered according to the descending order of row sums $\sum\nolimits_{j = 1}^{n} {A_{ij} }$, j = 1, 2, …, m, and drug agents are ordered according to the descending order of column sums $\sum\nolimits_{i = 1}^{m} {A_{ij} }$, j = 1,2, …, n, to select the top ranking few drug agents as the candidate drugs. Then we validated the proposed repurposed drugs by molecular docking simulation with the top ordered independent receptor proteins associated with CRC published by others.

Results

Identification of cDEGs

We identified 50 cDEGs, including 19 up-regulated (Fig. 2A) and 31 down-regulated (Fig. 2B) genes in CRC tissue, using adj.P.Val < 0.01 and logFC > 1 as the threshold for down-regulated cDEGs, and adj.P.Val < 0.01 and logFC < -1 for up-regulated cDEGs. The down and up regulated cDEGs were displayed on the right and left sides respectively in the volcano plot (Fig. 3 and Additional file 2).

Identification of key genes (KGs) from cDEGs

The PPI network of cDEGs was constructed using the STRING database, which includes 49 nodes and 175 edges, with an average node degree of 6.73 and P value < 1.0e−16. In the PPI network, Red color indicates up-regulated and black color indicates down-regulated cDEGs, big size and octagon shape indicate common key genes (KGs) (Fig. 4). We used four topological measures (Degree, BottleNeck, Betweenness, and Stress) to select top-ranked 11 KGs (Table 3) that are CXCL8, MMP7, CA4, ADH1C, GUCA2A, GUCA2B, CEMIP, ZG16, CLCA4, MS4A12 and CLDN1, where 4 KGs (CXCL8, CEMIP, CLDN1, and MMP7) were up-regulated and the rest 7 KGs were downregulated.

Table 3 Selection of KGs by combining the top ranked genes of five topological measurements with the PPI network

Full size table

In-silico validation of CRC-causing KGs by using different web-tools and independent databases

Expression analysis for KGs by GEPIA web-tool with TCGA RNA-seq data

In the GEPIA database, differences in transcriptional expression of the hub gene between CRC tissues and normal tissues were again verified. Combining with the box plot results, eleven potential KGs further were screened out. Based on the GEPIA database to test the relative expression of KGs mRNA, it was determined that our proposed KGs (CXCL8, CEMIP, MMP7, CA4, ADH1C, GUCA2A, GUCA2B, ZG16, CLCA4, MS4A12 and CLDN1) may be closely related to the occurrence and development of CRC (Fig. 5).

Correlation between KGs and immune infiltration levels in different cancers including CRC

We investigated the relationship of different tumors ininfiltrates immune cell types (B cell, CD8 T cell, CD4 T cell, neutrophil, macrophage and dendritic cell (DC)) with the expressions of KGs (Additional file 3). We observed (Additional file 4) that our proposed KGs are significantly associated with different tumor infiltrates immune cells under different databases of COAD (colon adenocarcinoma) and READ (Rectum adenocarcinoma). Compelling evidence has demonstrated that tumor-infiltrating lymphocytes are significantly associated with survival in cancer. Therefore, we investigated whether KGs expression was related to immune infiltration levels in lung cancer by TIMER. Tumor purity is an important factor affecting the analysis of immune infiltration. Interestingly, our results indicated that KGs expression was correlated with poor prognosis and high immune infiltration in CRC. KGs were highly expressed in monocytes (non-classical and classical) and B cells (naïve). In contrast, KGs expression was not significantly correlated with tumor purity or infiltrating levels of CD8⁺ T cells, CD4⁺ T cells or neutrophils in CRC. CA4 expression levels were positively correlated with infiltrating levels of B cells (r = 0.22, P = 2.47E−04), CD8⁺ T cells (r = − 0.16, P = 7.49E−03), CD4⁺ T cells (r = 0.19, P = 1.99E−03), macrophages (r = − 0.18, P = 2.19E−03), neutrophils (r = 0.38, P = 4.6E−11) and DCs (r = 0.23, P = 9.73E−05) in COAD (Additional files 3 and 4). CLCA4 expression levels were also positively correlated with infiltrating levels of B cells (r = 0.32, P = 1.92E−03), CD8⁺ T cells (r = − 0.23, P = 2.95E−02), CD4⁺ T cells (r = 0.39, P = 1.36E−04), macrophages (r = − 0.48, P = 8.42E−07), neutrophils (r = 0.5, P = 4.50E−07) and DCs (r = 0.33, P = 1.10E−03) in READ (Additional file 3 and 4). These findings strongly suggest that KGs plays an important role in immune infiltration in CRC, especially infiltration of Macrophage, T cell CD8⁺, T cell CD4⁺, Neutrophil, Myeloid dendritic cell, and B cell.

DNA methylation of KGs

DNA methylation at CpG (CG) sites play the vital role in cancer progression. Therefore, we investigated DNA methylation of KGs (CXCL8, CEMIP, MMP7, CA4, ADH1C, GUCA2A, GUCA2B, ZG16, CLCA4, MS4A12 and CLDN1) at CpG sites by MethSurv web-tool with TCGA database. We observed that seven KGs (CEMIP, MMP7, CA4, GUCA2B, ZG16, CLCA4, MS4A12) are significantly methylated at CpG sites (Table 4). The hypermethylation/downregulation gene CEMIP has six CpG sites with a P value < 0.05, the hypomethylation/upregulation gene GUCA2B has four CpG sites with a P value of < 0.05, the hypomethylation/upregulation gene MS4A12 has two CpG sites with a P value < 0.05, and the hypomethylation/upregulation gene MMP7, CLCA4, ZG16 has one CpG site with a P value of < 0.05, which is statistically significant (Table 4). We found that the difference in DNA methylation between CG12358698 of CEMIP, CG23532119 of MS4A12, CG00656728 of GUCA2B, CG24963041 of MMP7, CG26310643 of CLCA4, CG09229061 of ZG16, CG00200645 of CA4 and CG07510230 of ZNRF2 was most pronounced.

Table 4 The significant prognostic value of CpG in three key genes

Full size table

Association of KGs with different diseases including CRC

The disease-KGs interaction analysis showed that KGs are significantly associated with different types of colon or rectal cancers including Malignant tumor of colon, Colonic Neoplasms, Adenomatous Polyps, Adenocarcinoma, Adenoma of large intestine, Colorectal Neoplasms, Adenocarcinoma of colon, Colon Carcinoma, Stage III Colon Cancer AJCC v7, Stage III Colon Cancer, Intestinal Neoplasms, Adenoma and Metastatic Neoplasm (Fig. 6 and Table S2 in Additional file 1).

Prognostic power analysis

We considered both supervised and unsupervised learnings, including multivariate survival analysis, to investigate the prognostic power of KGs. Figure 7A shows that KGs can separate case and control samples accurately by the unsupervised hierarchical clustering (HC). The multivariate survival curves, based on the expressions of 11 KGs, separated the low and high-risk groups significantly (Fig. 7B). In the case of supervised learning, at first, we considered the expression profiles of 11 KGs from three datasets (GSE9348, GSE23878 and GSE110224) that contained 60 tumors and 50 control samples in total. Then we partitioned these datasets in to training (70%) and test (30%) sets. Then we trained one popular classifier known as random forest (RF). To test the prediction performance of the model, we also considered the expressions of 11 KGs from another two dataset GSE35279 and TCGA as the independent test set. Figure 7C showed the ROC curves based on the train, test performance, and independent test dataset of RF prediction model. The AUC values (area under the ROC curve) for RF were 1.00 with train data, 0.988 with test data, 0.943 with independent test data and 0.90 with TCGA dataset. Thus, both prediction models based on RF classifiers showed good performance for each of the dependent and independent test datasets of KGs.

Exploring CRC-causing GO and KEGG pathway terms that are associated with cDEGs including KGs

The GO functional enrichment analysis of showed that 185 GO-BP terms, 9 GO-CC terms and 38 GO-MF terms are enriched by the cDEGs genes, where KGs were involved with 57 BPs, 6 CCs and 21 MFs. Among the enriched GO functions including KGs, 6 GO-BP terms (GO:0034,31 ~ cell junction maintenance, GO:0098742 ~ cell–cell adhesion via plasma-membrane adhesion molecules, GO:0045216 ~ cell–cell junction organization, GO:0008285 ~ negative regulation of cell population proliferation, GO:0030334 ~ regulation of cell migration, and GO:0048565 ~ digestive tract development), 5 GO-CC terms (GO:0046658 ~ anchored component of plasma membrane, GO:0062023 ~ collagen-containing extracellular matrix, GO:0005923 ~ bicellular tight junction, GO:0043296 ~ apical junction complex, and GO:0005911 ~ cell–cell junction) and 6 GO-MF terms (GO:0005179 ~ hormone activity, GO:0030250 ~ guanylate cyclase activator activity, GO:0048018 ~ receptor ligand activity, GO:0005254 ~ chloride channel activity, GO:0008237 ~ metallopeptidase activity, and GO:0045236 ~ CXCR chemokine receptor binding) were reported by other researchers as the BPs of CRC (see Table 3 and discussion section for more details). The KEGG pathway enrichment analysis of cDEGs showed that 8 pathways are enriched by the KGs. Among them, KGs involving Nitrogen metabolism, Proximal tubule bicarbonate reclamation, Cell adhesion molecules, Pathogenic Escherichia coli infection, Human T-cell leukemia virus 1 infection, Amoebiasis, Leukocyte transendothelial migration, and Cytokine-cytokine receptor interaction was also reported by other researchers as the pathways of CRC development (see Table 5 and discussion section for more details as before).

Table 5 Top Enriched gene ontology (GO) terms and KEGG pathways by the proposed cDEGs highlighting cKGs

Full size table

Regulatory network analysis of KGs

We constructed KGs versus transcription factors (KGs-TFs) interaction network to identify top ranking few TFs as the key transcriptional regulators of KGs. We selected the top 4 key TFs (FOXC1, YY1, GATA2 and NFKB1) as the vital transcriptional regulators of KGs with degree ≥ 4, where the green color rectangle indicates top degree key TFs and, red and black color ellipse indicates KGs (Fig. 8A). To identify top ranking few micro-RNA (miRNA) as the key post-transcriptional regulators of KGs, we constructed a KGs-miRNAs interaction network. We selected the top 8 key miRNAs (hsa-mir-16-5p, hsa-mir-195-5p, hsa-mir-203a-3p, hsa-mir-34a-5p, hsa-mir-107, hsa-mir-27a-3p, hsa-mir-429, and hsa-mir-335-5p) as the vital regulators of KGs with degree ≥ 4, where green color rectangle indicates top degree key miRNAs and, red and black color ellipse indicates KGs (Fig. 8B).

Exploring candidate drug agents by molecular docking simulation

To explore candidate drugs for CRC, we considered 11 KGs based proteins (CXCL8, MMP7, CA4, ADH1C, GUCA2A, GUCA2B, CEMIP, ZG16, CLCA4, MS4A12 and CLDN1) and its regulatory key 4 TFs proteins (FOXC1, YY1, GATA2 and NFKB1) as the m = 15 drug target receptors. The 3-Dimension (3D) structure of CXCL8, MMP7, ZG16, CA4, YY1 and NFKB1 were downloaded from Protein Data Bank (PDB) with the PDB codes 6N2U, 1MMQ, 3APA, 5KU6, 4C5I and 1NFI and the rest of them, such as GUCA2A, GUCA2B, CLDN1, CLCA4, MS4A12, FOXC1 and GATA2 targets were downloaded from SWISS-MODEL using UniProt with IDs Q02747, Q16661, O95832, Q14CN2, Q9NXJ0, Q12948, and P23769 respectively. Then we considered 92 meta-drug molecules from the DSigDB database and 64 meta-drugs from the published articles and the Food and Drug Administration (FDA) as drug agents. The 3D structures of drug agents were downloaded from the PubChem database. Then we performed a molecular docking simulation between our proposed receptors and meta-drug agents. The binding affinity score matrix between the ordered receptors and ordered drug agents were displayed in Fig. 9A. We observed that Cyclosporin A produces highly significant binding affinity scores with all m = 15 target proteins, and their average binding affinity scores across all targets were − 9.46 (kcal/mol). The 2th and 3th top ordered drugs (Manzamine A and Cardidigin) produced highly significant binding affinity scores with 14 target proteins, and their average binding affinity scores across all m = 15 targets were − 8.22 and − 8.19, respectively. The 4th to 10th top ordered drug Staurosporine, Benzo[A] Pyrene, Sitosterol, Nocardiopsis Sp, Troglitazone, K-252a, and Riccardin D produced significant binding affinity scores with 14 target proteins, and the average binding affinity score was − 7.76, − 7.71, − 7.69, − 7.68, − 7.66, − 7.64, and − 7.62 respectively. The other drugs (lead compounds) produced significant binding affinity scores with less than 13 target proteins out of 15, and their average binding affinity scores were negatively smaller than − 7.5. Therefore, we considered the top ordered nine drugs (Cyclosporin A, Manzamine A, Cardidigin, Staurosporine, Benzo[A]Pyrene, Sitosterol, Nocardiopsis Sp, Troglitazone and Riccardin D) as the candidate drugs in our study. We also examined their complete interaction profile, including hydrogen bonds, hydrophobic, halogen/ salt Bridge and electrostatic interactions in Table 6.

Table 6 The 3-dimension view of strong binding interactions between targets and drugs is shown in the 4th column

Full size table

Performance investigation of proposed drugs by cross-validation with the top-ranked independent receptors

To investigate the resistance performance of our proposed 9 candidate drugs against the state-of-the-art alternative receptors for CRC by molecular docking, we considered the top-ranked 8 independent receptors (MYC, CDK1, CXCL1, CXCL8, CXCL12, TIMP1, AURKA, and TOP2A) published by others in different 36 articles for CRC (see Additional file 1: Table S1), where the receptor CXCL8 was common with our proposed receptor. The 3D structure of MYC, CDK1, CXCL12, TIMP1, AURKA, and TOP2A was downloaded from the PDB database with the PDB codes 6G6K, 6GU2, 6SHR, 2J0T, 6VPM, AND 1ZXM, respectively and for another one CXCL1, downloaded from SWISS-MODEL using UniProt with ID P09341. The we performed molecular docking analysis of 7 independent receptors with all of 160 drug agents. Figure 9B showed that our proposed 9 candidate drugs are also detected as the independent receptor-guided top-ranked 9 drugs. Therefore, we can strongly recommend that the proposed drugs might be more effective candidates than the other drugs for the treatment against CRC.

Connectivity map (CMap) analysis to discover the mechanism of action of drug agents

In an effort to elucidate its mechanism of action, we defined a signature for Troglitazone, Cardidigin and Staurosporine. High connectivity scores were found for multiple instances of five heat shock protein inhibitors: Angiotensin receptor antagonist, Topoisomerase inhibitor, Glycogen synthase kinase inhibitor, DNA dependent protein kinase inhibitor, and MTOR inhibitor. Despite the differences in the cells used to generate the query signature and reference profiles, the three highest-scoring compounds in the Con nectivity Map were Troglitazone, Cardidigin (Digitoxin use this name to use in Cmap) and Staurosporine (Fig. 10A). More important, the Connectivity Map also revealed strong connectivity with ten structurally distinct compounds, mocetinostat, ryuvidine, cyclopamine, dorsomorphin, JNJ-7706621, quinoclamine, SU-11652, bisacodyl, alvocidib, and rottlerin respective inhibitor are show in Fig. 10B. Cyclopamine and alvocidib compounds are not connected with Troglitazone.

AK3, OAZ2, NDUFAF4, EGFR, GNE, MAPK14, CSNK1G2, HSP90AB1, ZNF449, and GATAD1 genes depicts high positive connectivity with each of the drugs Troglitazone, Cardidigin and Staurosporine and their median connectivity score belongs to 97.96–97.53 (out of ± 100) which display in the Fig. 10C with corresponding enriched pathways of the connected gene. Moreover, the drug staurosporine was positively connected with 6 other genes namely OTUD3, TCF7L1, C2, TAAR1, PRDM1, and BMP2 with corresponding enriched pathways.

Discussion

The molecular mechanism of colorectal cancer (CRC) is not yet completely clear to the researchers. So potential molecular signatures are required to disclose molecular mechanisms of CRC and its therapeutic agents. The integrated statistics and bioinformatics analyses are now widely using to explore potential molecular signatures of malignant tumors [108]. Transcriptomics analysis is a popular way of identifying DEGs between normal and tumor tissue samples [109]. Therefore, in this study, we considered the integrated bioinformatics analyses for exploring common genomic biomarkers from five transcriptomics profiles (GSE9348, GSE35279, GSE23878, GSE110224 and GSE50760) for diagnosis, prognosis, and therapies of CRC. At first, we identified 11 KGs (CXCL8, MMP7, CA4, ADH1C, GUCA2A, GUCA2B, CEMIP, ZG16, CLCA4, MS4A12, and CLDN1) by PPI analysis of 50 common DEGs (cDEGs). Some literature reviews also agreed with our results that these KGs are associated with CRC [14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45, 54] (see Fig. 11A). For example, Li et al. [110] reported that the gene CXCL8 plays a vital role in CRC progression by mediating the differentiation, proliferation, and apoptosis within a regulatory network. So, they suggested this gene as a drug target for CRC also [110]. Chen and Ke [111] detected the gene MMP7 as a potential biomarker of CRC by bioinformatics analysis. Another study found that it regulates cancer progression and mediate the differentiation, proliferation, invasion and metastasis of various cancer cell types by different mechanisms [112]. A study reported that CA4 is a newly identified tumor suppressor gene in CRC by targeting the WTAP–WT1–TBL1 axis through the inhibition of the Wnt signaling pathway [113]. The gene ADH1C might lead the increasing production of proinflammatory mediators by decreasing its expressions in the ulcerative colitis colon through the activation of the STAT1/NF-κB signaling pathway [114].

The abnormally expressed peptide hormones GUCA2A and GUCA2B play as paracrine endogenous ligands for the guanylate cyclase-C (GUCY2C) receptor and help for the development of tumors in CRC by the association comparatively in lower levels with the disrupted intestinal homeostasis [115]. CEMIP is an adaptor protein of the O-GlcNAc transferase that can be reprogramming the glutamine metabolism through the reciprocal regulation of β-catenin and thereby promotes CRC metastasis. So, CRC metastasis can be prevented by the combinational inhibition of CEMIP and glutamine metabolism that would be a useful therapeutic strategy [116]. ZG16 can modulate the immune response in CRC by blocking the PD-L1 expression and the strong correlation designate ZG16 as a biomarker for the stratification of patent of immunotherapy [117]. Cancer-associated fibroblasts exosomes decreased the sensitivity of CRC towards the radiation and over-expressed miR-590-3p that promote CLCA4-dependent PI3K/Akt signaling pathway as well as cancer cell survival [118]. MS4A12 gene belonging to the MS4A family encodes a protein found in the apical membrane of colonocytes that plays an important role in the differentiation, proliferation, and cell cycle regulation and is believed to be a risk classification marker for early-stage colon cancer [119]. Upregulation of CLDN1 expression was observed in patients with colorectal cancer, which could be a possible biomarker for colorectal cancer treatment [120].

Moreover, different pertinent bioinformatic analyses based on independent databases significantly supported the relationship of KGs with the CRC progression as discussed below. The expression analysis with box-plots by GEPIA web-tool with TCGA RNA-seq data showed that KGs significantly separated CRC groups from the control groups (Fig. 5). We investigated the relationship of tumor infiltrates immune cells with the KGs and observed that KGs are significantly associated with different tumor infiltrates immune cells under different databases of CRC (Additional files 3 and 4). We investigated the DNA methylation of KGs at CpG sites by MethSurv web-tool with TCGA database and bserved that seven KGs (CEMIP, MMP7, CA4, GUCA2B, ZG16, CLCA4, MS4A12) are significantly methylated at CpG sites (Table 5) that may play the vital role in CRC progression. To investigate the prognostic power of KGs, we performed multivariate survival analysis and developed a prediction model through RF classifiers in Fig. 7C. Our developed prediction models showed good performance with both training and test datasets generated from the main data collected from NCBI with accession numbers GSE9348, GSE23878, and GSE110224. The AUC values were 1.000 for the training dataset and 0.988 for the test dataset for RF model. To investigate their performance unbiasedly, we also considered independent test datasets from other NCBI sources with accession numbers GSE35279, respectively. We observed that predictor show good performance with the independent test data and TCGA dataset. The values of AUC were 0.943 and 0.90 for independent test data and TCGA dataset based on RF model. These results indicate the good prediction performance for the identified KGs, so we suggested the prognostic model for the classifier (RF). The GO functional and KEGG pathway enrichment analyses of cDEGs significantly revealed some GO terms of BPs, MFs and CCs, and KEGG pathways by involving KGs that are highly linked with CRC patients (see Table 2). Our literature review also supported their link with CRC. As for examples with the enriched BPs, six GO terms cell junction maintenance [88], cell–cell adhesion [89], calcium-independent cell–cell adhesion [89], and cell–cell junction organization [89] these is associated with one KG (CLDN1). GO terms negative regulation of cell population proliferation [90] (associated with CXCL8), regulation of cell migration [91] (associated with MMP7 and CLDN1) and (associated with CA4) were reported as important BPs for CRC progression. Among the enriched MFs, two GO terms hormone activity [92] and receptor ligand activity [94] were associated with GUCA2A. The guanylate cyclase activator activity [93] were associated with GUCA2A, GUCA2B. The MFs terms chloride channel activity [95] was associated with CLCA4. The metallopeptidase activity [96] was associated with MMP7. Among the enriched CCs, anchored component of plasma membrane [98] was associated with CA4. collagen-containing extracellular matrix [99] was associated with ZG16. Four CCs term bicellular tight junction [100] apical junction complex [101] and cell–cell junction [102] were associated with CLCA1. Among the enriched KEGG pathways, two KEGG terms Nitrogen metabolism [22] and Proximal tubule bicarbonate reclamation [103] were associated with CA4. Three pathways Cell adhesion molecules [104], Pathogenic Escherichia coli infection [105] and Leukocyte transendothelial migration [106] were associated with CLDN1. Two pathways Pathogenic Escherichia coli infection [105], Amoebiasis [44], and Cytokine-cytokine receptor interaction [107] were associated with CXCL8.

The KGs-TFs interaction network analysis indicated that 4 TFs proteins (FOXC1, YY1, GATA2 and NFKB1) are the key transcriptional regulatory factors of KGs (see Fig. 4A). Among them, FOXC1 (a regulator of CA4, ADH1C, GUCA2A, GUCA2B, ZG16, and CLCA4) is connected with lymphatic vessel formation, arterial cell specification, and cardiovascular development [121]. The expression of TF-protein YY1 (a regulator of CXCL8, ADH1C, CEMIP, ZG16, and CLCA4) contributes to tumor growth differs in different cancers [122]. The TF-protein GATA2 (a regulator of GUCA2B, MMP7, CXCL8 and MS4A12) is connected with Hematopoietic and immune defects [123]. The TF-protein NFKB1 (a regulator of GUCA2A, CA4, CEMIP and MS4A12) is a suppressor of inflammation, ageing and cancer [124]. We also constructed the proteins-disease interaction network to detect other diseases connected with the proposed target proteins. Total 9 target proteins out of 11 were associated with other 547 diseases that can be considered as the risk factors of CRC. Especially, two diseases, "Malignant tumor of colon" and "Colonic Neoplasms", were mostly related to our target proteins.

To explore our proposed KGs-guided new and repurposable candidate drugs for the treatment against CRC, we considered the proposed KGs based 11 key proteins (CXCL8, MMP7, CA4, ADH1C, GUCA2A, GUCA2B, CEMIP, ZG16, CLCA4, MS4A12 and CLDN1) and their regulatory 4 TFs proteins (FOXC1, YY1, GATA2 and NFKB1) as the drug target receptors and performed their docking simulation with 167 drug molecules collected from the DSigDB database and published articles (Fig. 9A). Then we selected top-ranked 10 drugs (Cyclosporin A, Manzamine A, Cardidigin, Staurosporine, Benzo[A]Pyrene, Sitosterol, Nocardiopsis Sp, Troglitazone, K-252a and Riccardin D) as the most probable repurposable candidate drugs for CRC patients based on their strong binding affinity scores (kCal/mol) with all the target proteins (Fig. 9A, B). Then we investigated the resistance performance of both the proposed and already published candidate drugs against the state-of-the-art alternatives of already published top-ranked 7 independent receptors for CRC and observed that our proposed candidate drugs are more effective compared to the already published drugs against the independent receptors also (Fig. 11B). We also tried to validate our proposed drugs in favor of CRC by the literature review (Fig. 11B).

Among the identified candidate drugs Cyclosporin A, a calcineurin inhibitor, traditionally used for its immunosuppressive effects, inhibits the activity of the noncanonical Wnt/Ca⁺⁺/NFAT signaling pathway [125, 126]. It has been reported that Manzamine A exhibits an antiproliferative effect on human colorectal carcinoma cells and displays broad effects on gene expression to downregulate fundamental maintenances of cell survival and induce apoptotic cell death and EMT inactivation [127]. This study demonstrates the efficacy of Cardidigin (digitoxin) against cervical cancer in vivo and elucidates its molecular mechanisms, including DSBs, cell cycle arrest and mitochondrial apoptosis. These results will contribute to the development of Cardidigin as a chemotherapeutic agent in the treatment of cervical cancer [128]. Staurosporine alleviates cisplatin chemoresistance in human cancer (colon) cell models by suppressing the induction of SQSTM1/p62 [129]. Ajayi et al. [130] showed that Benzo[A] Pyrene induces oxidative stress, pro-inflammatory cytokines, expression of nuclear factor-kappa B, and deregulation of Wnt/β-catenin signaling in colons of exposed mice. Sitosterol (Beta-sitosterol) suppresses tumor growth without toxicity in AGS xenograft mouse models and induces apoptosis in human gastric adenocarcinoma cells [131]. Sitosterol prevents lipid peroxidation and improves antioxidant status and histoarchitecture in rats with 1,2-dimethylhydrazine-induced colon cancer [132]. The marine actinobacterium Nocardiopsis sp. MBRC-48 is an excellent microbial resource for the biosynthesis of gold nanoparticles with various biomedical applications such as antimicrobial, antioxidant, and anticancer activities [133]. The anti-proliferative and apoptotic activities of PDT in combination with the PPARγ ligand troglitazone and provide a strong rationale for testing the therapeutic potential of combination treatment in colon cancer [134]. Liu et al. showed that Riccardin D might inhibit cell proliferation and induce apoptosis in HT-29 cells, which may be associated with the blocking of the NF-κB signaling pathway [135]. Among the proposed nine candidate drugs, Cyclosporin A, Cardidigin, and Troglitazone are approved by the FDA for a different disease, the three other drugs (Staurosporine, Sitosterol and Nocardiopsis Sp.) are still investigational, and the rest of the three drugs (Manzamine A, Benzo[A]Pyrene, and Riccardin D) are not yet approved. The approved drugs for different diseases and unapproved drugs should be further assessed at the molecular level by the wet-lab experiments prior to clinical investigation in the treatment of CRC.

Conclusion

The main purpose of this study was to identify key genomic biomarkers from multiple gene expression profiles for diagnosis, prognosis and therapies of CRC by using integrated bioinformatics and statistical approaches. We identified 11 common key genes (KGs) from multiple transcriptomics datasets, where 4 KGs (CXCL8, CEMIP, MMP7,and CLDN1) were up-regulated and the rest 7 KGs (CA4, ADH1C, GUCA2A, GUCA2B, ZG16, CLCA4, and MS4A12) were downregulated. Different pertinent bioinformatic analyses including box plots of KGs-expressions with CRC and control groups, multivariate survival probability curves based on KGs-expressions, DNA methylation of KGs, correlation of KGs with immune infiltration levels in CRC, (different diseases)-KGs interaction, CRC-causing GO and KEGG pathways based on independent databases significantly supported the relationship of KGs with the CRC progression. Their association was also supported by several other independent studies directly or indirectly that we mentioned in the discussion section. We detected four TFs proteins (FOXC1, YY1, GATA2 and NFKB) and eight microRNAs (hsa-mir-16-5p, hsa-mir-195-5p, hsa-mir-203a-3p, hsa-mir-34a-5p, hsa-mir-107, hsa-mir-27a-3p, hsa-mir-429, and hsa-mir-335-5p) as the key transcriptional and post-transcriptional regulators that may play a vital role in the regulation of KGs. Then we considered the proposed 11 key proteins and their regulatory 4 TFs-proteins as the drug target receptors to explore effective drug agents for CRC by molecular docking simulation with the 156 meta-drug agents. We detected nine small molecules (Cyclosporin A, Manzamine A, Cardidigin, Staurosporine, Benzo[A]Pyrene, Sitosterol, Nocardiopsis Sp, Troglitazone, and Riccardin D) as the top-ranked candidate drugs for the treatment against CRC. Then we investigated the resistance performance of the proposed drugs against the state-of-the-art already published top-ranked 11 independent receptors for CRC and observed that our proposed repurposable candidate drugs are more effective compared to the already published drugs against the independent receptors also. Therefore, the proposed candidate drugs might be played a vital role in the treatment of CRC.

Availability of data and materials

The necessary R-codes and the datasets analyzed in this study is available at the web link: https://github.com/Horaira29/crc.git. (We collected these datasets from the National Center of Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database with the accession numbers GSE9348, GSE110224, GSE23878, GSE35279 and GSE50760 using the web links. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE9348, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE110224, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE23878, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE35279, and https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE50760 respectively. We collected meta-drug agents from the online database DSigDB [56] with respect to the proposed receptors and FDA approved repurposed drugs for the treatment of NSCLC patients https://www.cancer.gov/about-cancer/treatment/drugs/colorectal).

References

WHO. Cancer, fact sheet, World Health Organization. Newsroom. 2018. https://www.who.int/news-room/fact-sheets/detail/cancer. Accessed 18 Mar 2021.
Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65:87–108.
Article PubMed Google Scholar
Arnold M, Sierra MS, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global patterns and trends in colorectal cancer incidence and mortality. Gut. 2017;66:683–91.
Article PubMed Google Scholar
Kijima S, Sasaki T, Nagata K, Utano K, Lefor AT, Sugimot H. Preoperative evaluation of colorectal cancer using CT colonography, MRI, and PET/CT. World J Gastroenterol. 2014;20:16964–75.
Article PubMed PubMed Central Google Scholar
Brenner H, Kloor M, Pox CP. Colorectal cancer. In: The Lancet. Elsevier B.V.; 2014. p. 1490–502.
Ohhara Y, Fukuda N, Takeuchi S, Honma R, Shimizu Y, Kinoshita I, et al. Role of targeted therapy in metastatic colorectal cancer. World J Gastrointest Oncol. 2016;8:642.
Article PubMed PubMed Central Google Scholar
Watanabe T, Muro K, Ajioka Y, Hashiguchi Y, Ito Y, Saito Y, et al. Japanese Society for Cancer of the Colon and Rectum (JSCCR) guidelines 2016 for the treatment of colorectal cancer. Int J Clin Oncol. 2018;23:1–34.
Article PubMed Google Scholar
Santiago JA, Bottero V, Potashkin JA. Dissecting the molecular mechanisms of neurodegenerative diseases through network biology. Front Aging Neurosci. 2017;9:166.
Article PubMed PubMed Central Google Scholar
Rahman MR, Islam T, Turanli B, Zaman T, Faruquee HM, Rahman MM, et al. Network-based approach to identify molecular signatures and therapeutic agents in Alzheimer’s disease. Comput Biol Chem. 2019;78:431–9.
Article CAS PubMed Google Scholar
Islam T, Rahman R, Gov E, Turanli B, Gulfidan G, Haque A, et al. Drug targeting and biomarkers in head and neck cancers: insights from systems biology analyses. Omi A J Integr Biol. 2018;22:422–36.
Article CAS Google Scholar
Shahjaman M, Rezanur Rahman M, Shahinul Islam SM, Mollah MNH. A robust approach for identification of cancer biomarkers and candidate drugs. Medicina. 2019;55:269.
Article PubMed PubMed Central Google Scholar
Moni MA, Islam MB, Rahman MR, Rashed-Al-Mahfuz M, Awal MA, Islam SMS, et al. Network-based computational approach to identify delineating common cell pathways influencing type 2 diabetes and diseases of bone and joints. IEEE Access. 2020;8:1486–97.
Article Google Scholar
Mosharaf MP, Reza MS, Kibria MK, Ahmed FF, Kabir MH, Hasan S, et al. Computational identification of host genomic biomarkers highlighting their functions, pathways and regulators that influence SARS-CoV-2 infections and drug repurposing. Sci Rep. 2022;12:4279.
Article CAS PubMed PubMed Central Google Scholar
Dong S, Ding Z, Zhang H, Chen Q. Identification of prognostic biomarkers and drugs targeting them in colon adenocarcinoma: a bioinformatic analysis. Integr Cancer Ther. 2019;18:1534735419864434.
Article CAS PubMed PubMed Central Google Scholar
Rahman F, Mahmud P, Karim R, Hossain T, Islam F. Determination of novel biomarkers and pathways shared by colorectal cancer and endometrial cancer via comprehensive bioinformatics analysis. Inform Med Unlocked. 2020;20:100376.
Article Google Scholar
Dai GP, Wang LP, Wen YQ, Ren XQ, Zuo SG. Identification of key genes for predicting colorectal cancer prognosis by integrated bioinformatics analysis. Oncol Lett. 2020;19:388–98.
CAS PubMed Google Scholar
Guo Y, Bao Y, Ma M, Yang W. Identification of key candidate genes and pathways in colorectal cancer by integrated bioinformatical analysis. Int J Mol Sci. 2017;18:722.
Article PubMed PubMed Central Google Scholar
Ding X, Duan H, Luo H. Identification of core gene expression signature and key pathways in colorectal cancer. Front Genet. 2020;11:45.
Article CAS PubMed PubMed Central Google Scholar
Yuan Y, Chen J, Wang J, Xu M, Zhang Y, Sun P, et al. Identification hub genes in colorectal cancer by integrating weighted gene co-expression network analysis and clinical validation in vivo and vitro. Front Oncol. 2020;10:638.
Article PubMed PubMed Central Google Scholar
Liang B, Li C, Zhao J. Identification of key pathways and genes in colorectal cancer using bioinformatics analysis. Med Oncol. 2016;33:1–8.
Article CAS Google Scholar
Tang L, Lei YY, Liu YJ, Tang B, Yang SM. The expression of seven key genes can predict distant metastasis of colorectal cancer to the liver or lung. J Dig Dis. 2020;21:639–49.
Article CAS PubMed Google Scholar
Hozhabri H, Lashkari A, Razavi SM, Mohammadian A. Integration of gene expression data identifies key genes and pathways in colorectal cancer. Med Oncol. 2021;38:1–14.
Article Google Scholar
Berg KCG, Eide PW, Eilertsen IA, Johannessen B, Bruun J, Danielsen SA, et al. Multi-omics of 34 colorectal cancer cell lines—a resource for biomedical studies. Mol Cancer. 2017;16:1–16.
Article CAS Google Scholar
Qi Y, Qi H, Liu Z, He P, Li B. Bioinformatics Analysis of key genes and pathways in colorectal cancer. J Comput Biol. 2019;26:364–75.
Article CAS PubMed Google Scholar
Chen Z, Lin Y, Gao J, Lin S, Zheng Y, Liu Y, et al. Identification of key candidate genes for colorectal cancer by bioinformatics analysis. Oncol Lett. 2019;18:6583–93.
CAS PubMed PubMed Central Google Scholar
Xu H, Ma Y, Zhang J, Gu J, Jing X, Lu S, et al. Identification and verification of core genes in colorectal cancer. Biomed Res Int. 2020;2020.
Mastrogamvraki N, Zaravinos A. Signatures of co-deregulated genes and their transcriptional regulators in colorectal cancer. Npj Syst Biol Appl. 2020;6:23.
Article CAS PubMed PubMed Central Google Scholar
Liu X, Bing Z, Wu J, Zhang J, Zhou W, Ni M, et al. Integrative gene expression profiling analysis to investigate potential prognostic biomarkers for colorectal cancer. Med Sci Monit. 2020;26:e918906–11.
CAS PubMed PubMed Central Google Scholar
Pirim D. Integrative analyses of molecular pathways and key candidate biomarkers associated with colorectal cancer. Cancer Biomark. 2020;27:555–68.
Article CAS PubMed Google Scholar
Cui X, Shen K, Xie Z, Liu T, Zhang H. Identification of key genes in colorectal cancer using random walk with restart. Mol Med Rep. 2017;15:867–72.
Article CAS PubMed Google Scholar
Zhu H, Ji Y, Li W, Wu M. Identification of key pathways and genes in colorectal cancer to predict the prognosis based on mRNA interaction network. Oncol Lett. 2019;18:3778–86.
CAS PubMed PubMed Central Google Scholar
Peng WF, Bai F, Shao K, Shen LS, Li HH, Huang S. The key genes underlying pathophysiology association between the type 2-diabetic and colorectal cancer. J Cell Physiol. 2018;233:8551–7.
Article CAS PubMed Google Scholar
Liu S, Zeng F, Fan G, Dong Q. Identification of hub genes and construction of a transcriptional regulatory network associated with tumor recurrence in colorectal cancer by weighted gene co-expression network analysis. Front Genet. 2021;12:649752.
Article CAS PubMed PubMed Central Google Scholar
Asghari M, Abazari MF, Bokharaei H, Aleagha MN, Poortahmasebi V, Askari H, et al. Key genes and regulatory networks involved in the initiation, progression and invasion of colorectal cancer. Futur Sci OA. 2018;4:FSO278.
Article CAS Google Scholar
Lin T, Liang C, Peng W, Qiu Y, Peng L. Mechanisms of core Chinese herbs against colorectal cancer: A study based on data mining and network pharmacology. Evidence-based Complement Altern Med. 2020;2020.
Kasap E, Gerceker E, Boyacıoglu SÖ, Yuceyar H, Yıldırm H, Ayhan S, et al. The potential role of the NEK6, AURKA, AURKB, and PAK1 genes in adenomatous colorectal polyps and colorectal adenocarcinoma. Tumor Biol. 2016;37:3071–80.
Article CAS Google Scholar
Zheng Z, Xie J, Xiong L, Gao M, Qin L, Dai C, et al. Identification of candidate biomarkers and therapeutic drugs of colorectal cancer by integrated bioinformatics analysis. Med Oncol. 2020;37:1–11.
Article Google Scholar
Hameed Y, Usman M, Liang S, Ejaz S. Novel diagnostic and prognostic biomarkers of colorectal cancer: capable to overcome the heterogeneity-specific barrier and valid for global applications. PLoS ONE. 2021;16:e0256020.
Article CAS PubMed PubMed Central Google Scholar
Yang X, Wei W, Tan S, Guo L, Qiao S, Yao B, et al. Identification and verification of HCAR3 and INSL5 as new potential therapeutic targets of colorectal cancer. World J Surg Oncol. 2021;19:248.
Article PubMed PubMed Central Google Scholar
Zhang J, Zhang H, Li F, Song Z, Li Y, Zhao T. Identification of intestinal flora-related key genes and therapeutic drugs in colorectal cancer. BMC Med Genom. 2020;13:1–8.
Article Google Scholar
Rahman MR, Islam T, Gov E, Turanli B, Gulfidan G, Shahjaman M, et al. Identification of prognostic biomarker signatures and candidate drugs in colorectal cancer: insights from systems biology analysis. Medicina. 2019;55:20.
Article PubMed PubMed Central Google Scholar
Chen J, Wang Z, Shen X, Cui X, Guo Y. Identification of novel biomarkers and small molecule drugs in human colorectal cancer by microarray and bioinformatics analysis. Mol Genet Genomic Med. 2019;7:e00713.
Article PubMed PubMed Central Google Scholar
Yu C, Chen F, Jiang J, Zhang H, Zhou M. Screening key genes and signaling pathways in colorectal cancer by integrated bioinformatics analysis. Mol Med Rep. 2019;20:1259–69.
CAS PubMed PubMed Central Google Scholar
Yang J, Gao S, Qiu M, Kan S. Integrated analysis of gene expression and metabolite data reveals candidate molecular markers in colorectal carcinoma. Cancer Biother Radiopharm. 2020. https://doi.org/10.1089/cbr.2020.3980.
Article PubMed PubMed Central Google Scholar
Zhao Z, Fan X, Yang L, Song J, Fang S, Tu J, et al. The identification of a common different gene expression signature in patients with colorectal cancer. Math Biosci Eng. 2019;16:2942–58.
Article PubMed Google Scholar
Ma Y, Wen J, Wang J, Wang C, Zhang Y, Zhao L, et al. Asiaticoside antagonizes proliferation and chemotherapeutic drug resistance in hepatocellular carcinoma (HCC) cells. Med Sci Monit. 2020;26:e924435-1.
Article Google Scholar
Leng X, Yang J, Liu T, Zhao C, Cao Z, Li C, et al. A bioinformatics framework to identify the biomarkers and potential drugs for the treatment of colorectal cancer. Front Genet. 2022;13:2739.
Article Google Scholar
Wang Q, Huang X, Zhou S, Ding Y, Wang H, Jiang W, et al. IL1RN and PRRX1 as a prognostic biomarker correlated with immune infiltrates in colorectal cancer: evidence from bioinformatic analysis. Int J Genom. 2022;2022.
Sharma A, Yadav D, Rao P, Sinha S, Goswami D, Rawal RM, et al. Identification of potential therapeutic targets associated with diagnosis and prognosis of colorectal cancer patients based on integrated bioinformatics analysis. Comput Biol Med. 2022;146:105688.
Article CAS PubMed Google Scholar
Ekanem TI, Tsai WL, Lin YH, Tan WQ, Chang HY, Huang TC, et al. Identification of the effects of aspirin and sulindac sulfide on the inhibition of HMGA2-mediated oncogenic capacities in colorectal cancer. Molecules. 2020;25:3826.
Article CAS PubMed PubMed Central Google Scholar
Li B, Flaveny CA, Giambelli C, Fei DL, Han L, Hang BI, Bai F, Pei XH, Nose V, Burlingame O, Capobianco AJ. Repurposing the FDA-approved pinworm drug pyrvinium as a novel chemotherapeutic agent for intestinal polyposis. PLoS ONE. 2014;9:e101969.
Article PubMed PubMed Central Google Scholar
Sun Z, Liu C, Cheng SY. Identification of four novel prognosis biomarkers and potential therapeutic drugs for human colorectal cancer by bioinformatics analysis. J Biomed Res. 2020;1:1–15.
Google Scholar
Zheng Y, Zhou J, Tong Y. Gene signatures of drug resistance predict patient survival in colorectal cancer. Pharmacogenomics J. 2015;15:135–43.
Article CAS PubMed Google Scholar
Que W, Chen M, Yang L, Zhang B, Zhao Z, Liu M, et al. A network pharmacology-based investigation on the bioactive ingredients and molecular mechanisms of Gelsemium elegans Benth against colorectal cancer. BMC Complement Med Ther. 2021;21:1–18.
Article Google Scholar
Zhang M, Wang D, Lu F, Zhao R, Ye X, He L, et al. Identification of the active substances and mechanisms of ginger for the treatment of colon cancer based on network pharmacology and molecular docking. BioData Min. 2020. https://doi.org/10.21203/rs.3.rs-38111/v1.
Article PubMed PubMed Central Google Scholar
Huang S, Zhang Z, Li W, Kong F, Yi P, Huang J, et al. Network pharmacology-based prediction and verification of the active ingredients and potential targets of Zuojinwan for treating colorectal cancer. Drug Des Dev Ther. 2020;14:2725–40.
Article CAS Google Scholar
Yang F, Cai S, Ling L, Zhang H, Tao L, Wang Q. Identification of a five-gene prognostic model and its potential drug repurposing in colorectal cancer based on TCGA, GTEx and GEO databases. Front Genet. 2021;11:622659.
Article PubMed PubMed Central Google Scholar
L L, Y C, X L, M W, J L, Q X, et al. Upregulation of SNTB1 correlates with poor prognosis and promotes cell growth in colorectal cancer. 2021. https://doi.org/10.21203/RS.3.RS-549083/V1.
Mokgautsi N, Wang Y-C, Lawal B, Khedkar H, Sumitra MR, Wu ATH, et al. Network pharmacological analysis through a bioinformatics approach of novel NSC765600 and NSC765691 compounds as potential inhibitors of CCND1/CDK4/PLK1/CD44 in cancer types. Cancers (Basel). 2021;13:2523.
Article CAS PubMed Google Scholar
Yoo M, Shin J, Kim J, Ryall KA, Lee K, Lee S, et al. DSigDB: drug signatures database for gene set analysis. Bioinformatics. 2015;31:3069–71.
Article CAS PubMed PubMed Central Google Scholar
Smith GK. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004;3:1–25.
Article CAS Google Scholar
Ritchie ME, Phipson B, Wu DI, Hu Y, Law CW, Shi W, Smyth GK. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47.
Article PubMed PubMed Central Google Scholar
Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47:D607–13.
Article CAS PubMed Google Scholar
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–504.
Article CAS PubMed PubMed Central Google Scholar
Chin CH, Chen SH, Wu HH, Ho CW, Ko MT, Lin CY. cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst Biol. 2014;8:1–7.
Article Google Scholar
Jeong H, Mason SP, Barabási AL, Oltvai ZN. Lethality and centrality in protein networks. Nature. 2001;411:41–2.
Article CAS PubMed Google Scholar
Pržulj N, Wigle DA, Jurisica I. Functional topology in a network of protein interactions. Bioinformatics. 2004;20:340–8.
Article PubMed Google Scholar
Freeman LC. A set of measures of centrality based on betweenness. Sociometry. 1977;40:35–41.
Article Google Scholar
Shimbel A. Structural parameters of communication networks. Bull Math Biophys. 1953;15:501–7.
Article Google Scholar
Tang Z, Li C, Kang B, Gao G, Li C, Zhang Z. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res. 2017;45:W98–102.
Article CAS PubMed PubMed Central Google Scholar
Li T, Fu J, Zeng Z, Cohen D, Li J, Chen Q, et al. TIMER2.0 for analysis of tumor-infiltrating immune cells. Nucleic Acids Res. 2020;48:W509–14.
Article CAS PubMed PubMed Central Google Scholar
Modhukur V, Iljasenko T, Metsalu T, Lokk K, Laisk-Podar T, Vilo J. MethSurv: a web tool to perform multivariable survival analysis using DNA methylation data. Epigenomics. 2018;10:277–88.
Article CAS PubMed Google Scholar
Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90–7.
Article CAS PubMed PubMed Central Google Scholar
Piñero J, Ramírez-Anguita JM, Saüch-Pitarch J, Ronzano F, Centeno E, Sanz F, et al. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 2020;48:D845–55.
PubMed Google Scholar
Aguirre-Gamboa R Martı´nez-Ledesma E, Martı´nez-Torteya A, Chacolla-Huaringa R, et al. G-RH. SurvExpress: an online biomarker validation tool and database for cancer gene expression data using survival analysis. 2018.
Doms A, Schroeder M. GoPubMed: exploring PubMed with the gene ontology. Nucleic Acids Res. 2005;33(SUPPL):2.
Google Scholar
Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30.
Article CAS PubMed PubMed Central Google Scholar
Xie Z, Bailey A, Kuleshov MV, Clarke DJB, Evangelista JE, Jenkins SL, et al. Gene set knowledge discovery with Enrichr. Curr Protoc. 2021;1:e90.
Article CAS PubMed PubMed Central Google Scholar
Fornes O, Castro-Mondragon JA, Khan A, Van Der Lee R, Zhang X, Richmond PA, JASPAR, et al. Update of the open-Access database of transcription factor binding profiles. Nucleic Acids Res. 2020;2020:48.
Google Scholar
Zhou G, Soufan O, Ewald J, Hancock REW, Basu N, Xia J. NetworkAnalyst 3.0: a visual analytics platform for comprehensive gene expression profiling and meta-analysis. Nucleic Acids Res. 2019;47:W234–41.
Article CAS PubMed PubMed Central Google Scholar
Chou CH, Shrestha S, Yang CD, Chang NW, Lin YL, Liao KW, MiRTarBase update, et al. A resource for experimentally validated microRNA-target interactions. Nucleic Acids Res. 2018;2018:46.
Google Scholar
Berman HM, Battistuz T, Bhat TN, Bluhm WF, Bourne PE, Burkhardt K, et al. The protein data bank. Acta Crystallogr Sect D Biol Crystallogr. 2002;58(6):899–907.
Article Google Scholar
Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. PubChem 2019 update: improved access to chemical data. Nucleic Acids Res. 2019;47:D1102–9.
Article PubMed Google Scholar
Systèmes D. BIOVIA, discovery studio visualizer, release 2019. San Diego Dassault Systèmes. 2020.
Gruber A, Durham AM, Huynh C, del Portillo HA. Defining and searching for structural motifs using DeepView/Swiss-PdbViewer. BMC Bioinform. 2008;13:1–11.
Google Scholar
Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2009. https://doi.org/10.1002/jcc.21334.
Article Google Scholar
Dallakyan S, Olson AJ. Small-molecule library screening by docking with PyRx. Methods Mol Biol. 2015;1263:243–50.
Article CAS PubMed Google Scholar
Ibáňez Gaspar V, Catozzi S, Ternet C, Luthert PJ, Kiel C. Analysis of Ras-effector interaction competition in large intestine and colorectal cancer context. Small GTPases. 2021;12:209–25.
Article PubMed Google Scholar
Faux MC, King LE, Kane SR, Love C, Sieber OM, Burgess AW. APC regulation of ESRP1 and p120-catenin isoforms in colorectal cancer cells. Mol Biol Cell. 2021;32:120–30.
Article CAS PubMed PubMed Central Google Scholar
Fan Q, Liu B. Identification of the anticancer effects of a novel proteasome inhibitor, ixazomib, on colorectal cancer using a combined method of microarray and bioinformatics analysis. Onco Targets Ther. 2017;10:3591–606.
Article PubMed PubMed Central Google Scholar
Gao Y, Zhang S, Zhang Y, Qian J. Identification of microRNA-target gene-transcription factor regulatory networks in colorectal adenoma using microarray expression data. Front Genet. 2020;11:463.
Article CAS PubMed PubMed Central Google Scholar
Lv J, Li L, Duan B. Hub genes and key pathway identification in colorectal cancer based on bioinformatic analysis. Biomed Res Int. 2019;2019.
Ai D, Wang Y, Li X, Pan H. Colorectal cancer prediction based on weighted gene co-expression network analysis and variational auto-encoder. Biomolecules. 2020;10:1–11.
Article Google Scholar
Xu K, He J, Zhang J, Liu T, Yang F, Ren T. A novel prognostic risk score model based on immune-related genes in patients with stage IV colorectal cancer. Biosci Rep. 2020;40.
Chen TJ, He HL, Shiue YL, Yang CC, Lin LC, Tian YF, et al. High chloride channel accessory 1 expression predicts poor prognoses in patients with rectal cancer receiving chemoradiotherapy. Int J Med Sci. 2018;15:1171–8.
Article CAS PubMed PubMed Central Google Scholar
Wang W, Sun JF, Wang XZ, Ying HQ, You XH, Sun F. A novel prognostic score based on zg16 for predicting crc survival. Pharmgenomics Pers Med. 2020;13:735–47.
CAS PubMed PubMed Central Google Scholar
Zheng W, Yang C, Qiu L, Feng X, Sun K, Deng H. Transcriptional information underlying the generation of CSCs and the construction of a nine-mRNA signature to improve prognosis prediction in colorectal cancer. Cancer Biol Ther. 2020;21:688–97.
Article CAS PubMed PubMed Central Google Scholar
Sugiyama Y, Wakazaki M, Toyooka K, Fukuda H, Oda Y. A Novel plasma membrane-anchored protein regulates xylem cell-wall deposition through microtubule-dependent lateral inhibition of rho GTPase domains. Curr Biol. 2017;27:2522-2528.e4.
Article CAS PubMed Google Scholar
Chu XD, Zhang YR, Lin ZB, Zhao Z, Huangfu SC, Qiu SH, et al. A network pharmacology approach for investigating the multitarget mechanisms of Huangqi in the treatment of colorectal cancer. Transl Cancer Res. 2021;10:681–93.
Article CAS PubMed PubMed Central Google Scholar
Angius A, Uva P, Pira G, Muroni MR, Sotgiu G, Saderi L, et al. Integrated analysis of miRNA and mRNA endorses a twenty miRNAs signature for colorectal carcinoma. Int J Mol Sci. 2019;20:4067.
Article CAS PubMed PubMed Central Google Scholar
Gehren AS, Rocha MR, de Souza WF, Morgado-Díaz JA. Alterations of the apical junctional complex and actin cytoskeleton and their role in colorectal cancer progression. Tissue Barriers. 2015;3:1–12.
Article Google Scholar
Kim WK, Kwon Y, Jang M, Park M, Kim J, Cho S, et al. Β-catenin activation down-regulates cell–cell junction-related genes and induces epithelial-to-mesenchymal transition in colorectal cancers. Sci Rep. 2019;9:1–15.
Google Scholar
Patil AR, Leung MY, Roy S. Identification of hub genes in different stages of colorectal cancer through an integrated bioinformatics approach. Int J Environ Res Public Health. 2021;18:5564.
Article CAS PubMed PubMed Central Google Scholar
Codrich M, Dalla E, Mio C, Antoniali G, Malfatti MC, Marzinotto S, et al. Integrated multi-omics analyses on patient-derived CRC organoids highlight altered molecular pathways in colorectal cancer progression involving PTEN. J Exp Clin Cancer Res. 2021;40:1–17.
Article Google Scholar
Veziant J, Gagnière J, Jouberton E, Bonnin V, Sauvanet P, Pezet D, et al. Association of colorectal cancer with pathogenic Escherichia coli: focus on mechanisms using optical imaging. World J Clin Oncol. 2016;7:293–301.
Article PubMed PubMed Central Google Scholar
Huang C, Ou R, Chen X, Zhang Y, Li J, Liang Y, et al. Tumor cell-derived SPON2 promotes M2-polarized tumor-associated macrophage infiltration and cancer progression by activating PYK2 in CRC. J Exp Clin Cancer Res. 2021;40:1–17.
Article Google Scholar
Singh MP, Rai S, Singh NK, Srivastava S. Transcriptomic landscape of early age onset of colorectal cancer identifies novel genes and pathways in Indian CRC patients. Sci Rep. 2021;11:1–11.
Google Scholar
Wu D, Wang X. Application of clinical bioinformatics in lung cancer-specific biomarkers. Cancer Metastasis Rev. 2015;34:209–16.
Article CAS PubMed Google Scholar
Liang Y, Diehn M, Watson N, Bollen AW, Aldape KD, Nicholas MK, et al. Gene expression profiling reveals molecularly and clinically distinct subtypes of glioblastoma multiforme. Proc Natl Acad Sci U S A. 2005;102:5814–9.
Article CAS PubMed PubMed Central Google Scholar
Li J, Liu Q, Huang X, Cai Y, Song L, Xie Q, et al. Transcriptional profiling reveals the regulatory role of CXCL8 in promoting colorectal cancer. Front Genet. 2020;10:1360.
Article PubMed PubMed Central Google Scholar
Chen L, Ke X. MMP7 as a potential biomarker of colon cancer and its prognostic value by bioinformatics analysis. Medicine (Baltimore). 2021;100: e24953.
Article CAS PubMed Google Scholar
Liao HY, Da CM, Liao B, Zhang HH. Roles of matrix metalloproteinase-7 (MMP-7) in cancer. Clin Biochem. 2021;92:9–18.
Article CAS PubMed Google Scholar
Zhang J, Tsoi H, Li X, Wang H, Gao J, Wang K, et al. Carbonic anhydrase IV inhibits colon cancer development by inhibiting the Wnt signalling pathway through targeting the WTAP-WT1-TBL1 axis. Gut. 2016;65:1482–93.
Article CAS PubMed Google Scholar
Song F, Zhang Y, Pan Z, Hu X, Zhang Q, Huang F, et al. The role of alcohol dehydrogenase 1C in regulating inflammatory responses in ulcerative colitis. Biochem Pharmacol. 2021;192:114691.
Article CAS PubMed Google Scholar
Zhang H, Du Y, Wang Z, Lou R, Wu J, Feng J. Integrated analysis of oncogenic networks in colorectal cancer identifies GUCA2A as a Molecular marker. Biochem Res Int. 2019;2019.
Hua Q, Zhang B, Xu G, Wang L, Wang H, Lin Z, et al. CEMIP, a novel adaptor protein of OGT, promotes colorectal cancer metastasis through glutamine metabolic reprogramming via reciprocal regulation of β-catenin. Oncogene. 2021;2021:1–13.
Google Scholar
Meng H, Ding Y, Liu E, Li W, Wang L. ZG16 regulates PD-L1 expression and promotes local immunity in colon cancer. Transl Oncol. 2021;14:101003.
Article CAS PubMed Google Scholar
Chen X, Liu Y, Zhang Q, Liu B, Cheng Y, Zhang Y, et al. Exosomal miR-590-3p derived from cancer-associated fibroblasts confers radioresistance in colorectal cancer. Mol Ther Nucleic Acids. 2021;24:113–26.
Article CAS PubMed Google Scholar
He L, Deng HY, Wang XC. Decreased expression of MS4A12 inhibits differentiation and predicts early stage survival in colon cancer. Neoplasma. 2017;64:65–73.
Article CAS PubMed Google Scholar
Liu W. Long non-coding RNA VPS9D1-AS1 promotes growth of colon adenocarcinoma by sponging miR-1301-3p and CLDN1. Hum Cell. 2021;34:1775–87.
Article CAS PubMed Google Scholar
Kume T. The cooperative roles of Foxcl and Foxc2 in cardiovascular development. Adv Exp Med Biol. 2009;665:63–77.
Article CAS PubMed Google Scholar
Sarvagalla S, Kolapalli SP, Vallabhapurapu S. The two sides of YY1 in cancer: a friend and a foe. Front Oncol. 2019;9:1230.
Article PubMed PubMed Central Google Scholar
Collin M, Dickinson R, Bigley V. Haematopoietic and immune defects associated with GATA2 mutation. Br J Haematol. 2015;169:173–87.
Article CAS PubMed PubMed Central Google Scholar
Cartwright T, Perkins ND, L. Wilson C. NFKB1: a suppressor of inflammation, ageing and cancer. FEBS J. 2016;283:1812–22.
Article CAS PubMed Google Scholar
Kim J, Kim DW, Chang W, Choe J, Kim J, Park C-S, et al. Wnt5a is secreted by follicular dendritic cells to protect germinal center B cells via Wnt/Ca²⁺/NFAT/NF-κB–B Cell lymphoma 6 signaling. J Immunol. 2012;188:182–9.
Article CAS PubMed Google Scholar
Gregory MA, Phang TL, Neviani P, Alvarez-Calderon F, Eide CA, O’Hare T, et al. Wnt/Ca²⁺/NFAT signaling maintains survival of Ph⁺ leukemia cells upon inhibition of Bcr-Abl. Cancer Cell. 2010;18:74–87.
Article CAS PubMed PubMed Central Google Scholar
Lin LC, Kuo TT, Chang HY, Liu WS, Hsia SM, Huang TC. Manzamine a exerts anticancer activity against human colorectal cancer cells. Mar Drugs. 2018;16:252.
Article PubMed PubMed Central Google Scholar
Gan H, Qi M, Chan C, Leung P, Ye G, Lei Y, et al. Digitoxin inhibits HeLa cell growth through the induction of G2/M cell cycle arrest and apoptosis in vitro and in vivo. Int J Oncol. 2020;57:562–73.
Article CAS PubMed Google Scholar
Alsamman K, El-Masry OS. Staurosporine alleviates cisplatin chemoresistance in human cancer cell models by suppressing the induction of SQSTM1/p62. Oncol Rep. 2018;40:2157–62.
CAS PubMed Google Scholar
Ajayi BO, Adedara IA, Farombi EO. Benzo(a)pyrene induces oxidative stress, pro-inflammatory cytokines, expression of nuclear factor-kappa B and deregulation of wnt/beta-catenin signaling in colons of BALB/c mice. Food Chem Toxicol. 2016;95:42–51.
Article CAS PubMed Google Scholar
Shin EJ, Choi HK, Sung MJ, Park JH, Chung MY, Chung S, et al. Anti-tumour effects of beta-sitosterol are mediated by AMPK/PTEN/HSP90 axis in AGS human gastric adenocarcinoma cells and xenograft mouse models. Biochem Pharmacol. 2018;152:60–70.
Article CAS PubMed Google Scholar
Baskar AA, AlNumair KS, GabrielPaulraj M, Alsaif MA, Muamar MA, Ignacimuthu S. Β-sitosterol prevents lipid peroxidation and improves antioxidant status and histoarchitecture in rats with 1,2-dimethylhydrazine-induced colon cancer. J Med Food. 2012;15:335–43.
Article CAS PubMed Google Scholar
Manivasagan P, Alam MS, Kang KH, Kwak M, Kim SK. Extracellular synthesis of gold bionanoparticles by Nocardiopsis sp. and evaluation of its antimicrobial, antioxidant and cytotoxic activities. Bioprocess Biosyst Eng. 2015;38:1167–77.
Article CAS PubMed Google Scholar
Park H, Ko SH, Lee JM, Park JH, Choi YH. Troglitazone enhances the apoptotic response of DLD-1 colon cancer cells to photodynamic therapy. Yonsei Med J. 2016;57:1494–9.
Article CAS PubMed PubMed Central Google Scholar
Liu H, Li G, Zhang B, Sun D, Wu J, Chen F, et al. Suppression of the NF-κB signaling pathway in colon cancer cells by the natural compound Riccardin D from Dumortierahirsute. Mol Med Rep. 2018;17:5837–43.
CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

We would like to acknowledge both editors and reviewers for their valuable comments and suggestions that help us to improve the quality of the manuscript.

Funding

This work was completed without funding.

Author information

Authors and Affiliations

Bioinformatics Lab, Department of Statistics, University of Rajshahi, Rajshahi, 6205, Bangladesh
Md Abu Horaira, Md. Ariful Islam, Md. Kaderi Kibria, Md. Jahangir Alam & Md. Nurul Haque Mollah
Department of Biochemistry and Molecular Biology, University of Rajshahi, Rajshahi, 6205, Bangladesh
Syed Rashel Kabir

Authors

Md Abu Horaira
View author publications
You can also search for this author in PubMed Google Scholar
Md. Ariful Islam
View author publications
You can also search for this author in PubMed Google Scholar
Md. Kaderi Kibria
View author publications
You can also search for this author in PubMed Google Scholar
Md. Jahangir Alam
View author publications
You can also search for this author in PubMed Google Scholar
Syed Rashel Kabir
View author publications
You can also search for this author in PubMed Google Scholar
Md. Nurul Haque Mollah
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

MAH and MNHM conceived the idea of the study. MAH, MAI and MKK jointly analyzed gene expression profiles using different statistical and bioinformatics tools. MAH performed molecular docking and drafted the manuscript. MJA, SRK and MNHM edited the manuscript and provided important suggestions and MNHM supervised the project. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Md. Nurul Haque Mollah.

Ethics declarations

Ethics approval and consent to participate

All methods in the study were performed in accordance with the relevant guidelines.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Table S1.

Different lists of hub-genes [HubGs] for colorectal cancer [CRC] published in different articles. Table S2. Association of cKGs with other disease risks.

Additional file 2.

The details of common differentially expressed genes (cDEGs) for the five GEO datasets (GSE110224, GSE50760, GSE35279, and GSE23878) including Gene.symbol, ID, adj.P.Val, P.Value, t, B, logFC, and Gene.title.

Additional file 3.

Correlation plots illustrating the relationship between gene expression and immune infiltration levels across multiple types of cancers, using the bioinformatics tool TIMER 2.0.

Additional file 4.

Correlation between key genes (KGs) and immune infiltration levels in colorectal cancers (CRC).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Horaira, M.A., Islam, M.A., Kibria, M.K. et al. Bioinformatics screening of colorectal-cancer causing molecular signatures through gene expression profiles to discover therapeutic targets and candidate agents. BMC Med Genomics 16, 64 (2023). https://doi.org/10.1186/s12920-023-01488-w

Download citation

Received: 12 April 2022
Accepted: 14 March 2023
Published: 29 March 2023
DOI: https://doi.org/10.1186/s12920-023-01488-w

Bioinformatics screening of colorectal-cancer causing molecular signatures through gene expression profiles to discover therapeutic targets and candidate agents

Abstract

Background

Methods

Results

Conclusion

Similar content being viewed by others

Identification of candidate biomarkers and therapeutic drugs of colorectal cancer by integrated bioinformatics analysis

Identification and verification of HCAR3 and INSL5 as new potential therapeutic targets of colorectal cancer

Gamma-Aminobutyric Acid Type A Receptor Subunit Pi is a potential chemoresistance regulator in colorectal cancer

Introduction

Materials and methods

Data sources and descriptions

Collection of gene expression profiles for exploring drug-target proteins (receptors)

Collection of meta-drug agents for exploring candidate drugs

Collection of independent meta-receptors for cross-validation with the proposed drugs

Integrated statistics and bioinformatics approaches

Identification of DEGs by using LIMMA

Construction of PPI network to identify CRC-causing key genes (KGs)

In-silico validation of CRC-causing KGs

Expression analysis for KGs by GEPIA web-tool with TCGA RNA-seq data

Association of KGs with the immune infiltration levels in different cancers including CRC

DNA methylation of KGs

Association of KGs with different disease

Prognostic power analysis of KGs

Exploring GO and KEGG pathway terms that are associated with DEGs including KGs

Regulatory network analysis of KGs

Molecular docking simulation for exploring candidate drug agents

Results

Identification of cDEGs

Identification of key genes (KGs) from cDEGs

In-silico validation of CRC-causing KGs by using different web-tools and independent databases

Expression analysis for KGs by GEPIA web-tool with TCGA RNA-seq data

Correlation between KGs and immune infiltration levels in different cancers including CRC

DNA methylation of KGs

Association of KGs with different diseases including CRC

Prognostic power analysis

Exploring CRC-causing GO and KEGG pathway terms that are associated with cDEGs including KGs

Regulatory network analysis of KGs

Exploring candidate drug agents by molecular docking simulation

Performance investigation of proposed drugs by cross-validation with the top-ranked independent receptors

Connectivity map (CMap) analysis to discover the mechanism of action of drug agents

Discussion

Conclusion

Availability of data and materials

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher's Note

Supplementary Information

Additional file 1. Table S1.

Additional file 2.

Additional file 3.

Additional file 4.

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation