Highly perturbed genes and hub genes associated with type 2 diabetes in different tissues of adult humans: a bioinformatics analytic workflow

Type 2 diabetes (T2D) has a complex etiology which is not yet fully elucidated. The identification of gene perturbations and hub genes of T2D may deepen our understanding of its genetic basis. We aimed to identify highly perturbed genes and hub genes associated with T2D via an extensive bioinformatics analytic workflow consisting of five steps: systematic review of Gene Expression Omnibus and associated literature; identification and classification of differentially expressed genes (DEGs); identification of highly perturbed genes via meta-analysis; identification of hub genes via network analysis; and downstream analysis of highly perturbed genes and hub genes. Three meta-analytic strategies, random effects model, vote-counting approach, and p value combining approach, were applied. Hub genes were defined as those nodes having above-average betweenness, closeness, and degree in the network. Downstream analyses included gene ontologies, Kyoto Encyclopedia of Genes and Genomes pathways, metabolomics, COVID-19-related gene sets, and Genotype-Tissue Expression profiles. Analysis of 27 eligible microarrays identified 6284 DEGs (4592 downregulated and 1692 upregulated) in four tissue types. Tissue-specific gene expression was significantly greater than tissue non-specific (shared) gene expression. Analyses revealed 79 highly perturbed genes and 28 hub genes. Downstream analyses identified enrichments of shared genes with certain other diabetes phenotypes; insulin synthesis and action-related pathways and metabolomics; mechanistic associations with apoptosis and immunity-related pathways; COVID-19-related gene sets; and cell types demonstrating over- and under-expression of marker genes of T2D. Our approach provided valuable insights on T2D pathogenesis and pathophysiological manifestations. Broader utility of this pipeline beyond T2D is envisaged. Supplementary Information The online version contains supplementary material available at 10.1007/s10142-022-00881-5.


Introduction
According to an analysis of global data through years 1990-2018, diabetes was prevalent in almost half a billion people, a number expected to rise by 25% and 51%, respectively, by 2030 and 2045 (Saeedi et al. 2019). Type 2 diabetes (T2D) accounts for over 90% of all diabetes cases (Zheng et al. 2018), affecting nearly 6.8% (537 million) of the world population in year 2021 (Sun et al. 2022). Current evidence suggests a complex and multifactorial etiology of T2D characterized by genetic and environmental interactions (Arroyo et al. 2021), although T2D pathogenesis is not yet fully elucidated. There are likely varying degrees of shared genetic origins in the pathogenesis of T2D and other diabetes phenotypes such as type 1 diabetes (T1D) (Aylward et al. 2018), latent autoimmune diabetes in adults (Basile et al. 2014), and maturity-onset diabetes of the young (MODY) (Bonnefond et al. 2020). Recent studies also support the deconstruction of T2D heterogeneity to define T2D sub-types (Udler et al. 2018) and the delineation of a continuum of diabetes sub-types (Flannick et al. 2016) instead of the status quo characterized by a few distinct diabetes phenotypes. Understanding the genetic basis of T2D is fundamental to precision medicine approaches striving for impeccable matching and an individualized level of T2D care (Prasad and Groop 2019).
Cardinal tissues of the body impacted by heightened insulin resistance and diminished insulin secretion in T2D include the pancreas, liver, skeletal muscle, and adipose tissue (Batista et al. 2021). Exploration of gene perturbations in these tissues can deepen our understanding of the molecular etio-pathology of T2D. Highly up-and downregulated genes expressed consistently across different tissue types may uncover potential genome-wide biomarkers or "gene signatures," which are integral to achieving precision diagnostic, prognostic, monitoring, and treatment approaches. Topologically, hub genes are defined as highly and tightly connected nodes in typically scale-free gene regulatory networks (GRN) (Buchberger et al. 2021). As such, criteria such as high correlation in candidate modules (Liu et al. 2019a) and above-average betweenness, closeness, and degree (Liu et al. 2019b) in GRN have been used to demarcate hub genes in previous studies. Functionally, they perform critical regulatory roles in biological processes interacting with many other genes in associated pathways. Given their crucial structural and functional characteristics, hub genes are highly sought-after in precision medicine approaches as plausible niches for developing drug and treatment targets (Liu et al. 2021). In this context, the importance of identifying highly perturbed genes and hub genes associated with T2D for the purpose of individualizing T2D care is unequivocal.
Downstream analyses of gene sets provide invaluable insights into associated core biological functions, pathways, diseases, drugs, and many other aspects. Frequently used gene ontology (GO) analysis provides evidence as a snapshot of contemporary biological knowledge related to a given gene including its function at the molecular level, the cellular location(s) it functions at, and the biological processes reliant on it (Hill et al. 2008). Pathway analyses such as Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa and Goto 2000) entail derivation of coherent and meaningful biological phenomena attributable to input genes (Nguyen et al. 2019). Additional downstream analyses of diseases, drugs, metabolomes, and tissue enrichment are also available and can provide valuable insights into associated genes. Taken together, downstream analysis of highly perturbed and hub genes associated with T2D may render valuable information on aspects such as affected biological processes, dysregulated pathways, related diseases, and metabolomic biomarkers.
Advances in high-throughput technologies have generated a wealth of gene expression data, while the availability of open-source platforms such as the National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) (Edgar et al. 2002;Barrett et al. 2013) and simultaneous advent in big data and bioinformatics analytic tools such as microarray and RNA-seq meta-analysis and gene-gene interaction network analysis strategies have offered unprecedented opportunities for high-level evidence synthesis from a multitude of gene expression datasets. Such approaches are likely to render new knowledge on complex diseases like T2D and acquire adequate statistical power to identify genes associated with a disease that may not have been evoked via prior analysis of a single or a few datasets.
Yet, to date, no comprehensive evidence synthesis study has been performed to identify highly perturbed genes and hub genes associated with T2D in human adults using an extensive bioinformatics analytic pipeline. Prior studies were limited to identifying hub genes in a few (n = 3) microarrays from a single tissue type (pancreatic islets) (Lin et al. 2020) or performing an ad hoc gene expression meta-analysis of all diabetes phenotypes (n = 13) (Mei et al. 2017). In this study, we aimed to identify highly perturbed genes and hub genes associated with T2D in different tissues of adult humans via a pre-defined and extensive bioinformatics analytic workflow consisting of systematic review, meta-analysis, identification and classification of differentially expressed genes (DEGs), network analysis, and downstream analysis.

Methods
The methodological approach consisted of five sequential steps: (1) systematic review of NCBI GEO expression data and related publications, (2) analysis of microarrays to identify DEGs, (3) meta-analysis of DEGs to identify highly perturbed genes in T2D, (4) network analysis of gene-gene/protein-protein interactions to identify hub genes in T2D, and (5) downstream analysis of highly perturbed genes and hub genes associated with T2D.

Systematic review
A preliminary search on the NCBI GEO database was first run on 1st February 2021 using a pre-defined search string: "("diabetes mellitus, type 2" [MeSH Terms] OR type 2 diabetes[All Fields]) AND "Homo sapiens" [porgn] AND "Expression profiling by array" [Filter]." The resulting microarrays and related publications were further screened against pre-defined eligibility criteria. Microarrays of conditions other than T2D, all diabetes phenotypes other than T2D, and early dysglycemic conditions such as impaired glucose tolerance, insulin resistance, and prediabetes were excluded. Studies with non-human specimens, children, without healthy controls, or with notable comorbidity in control samples were also excluded. We also omitted studies involving long non-coding RNA (lncRNA), micro RNA (miRNA), samples subject to drug treatments and other interventions, pluripotent stem cells, xenografts, transfected or transgenic tissues, undifferentiated tissues, and sub-samples in super-series. Microarrays passing these eligibility criteria were selected for manual curation which were then further screened along with the full texts of related publications (where available) for the presence of adequate information such as clinical diagnosis (healthy vs T2D) and gene symbol/Entrez ID. Following this, all microarrays with sufficient information were selected for subsequent analyses.

Identification of DEGs
All microarrays selected from the systematic review were imported to R using getGEO function of GEOquery package (Davis and Meltzer 2007). In each expression set, pheno-Data component was examined to determine the number of eligible samples and confirm the presence of outcome (T2D vs controls) variable, while featureData component was explored to verify the presence of gene annotation information. Where required, non-normalized gene expression matrices were log 2 transformed in order to alleviate a, differentially expressed genes were defined as BH-adjusted p value < 0.05 and log 2 FC > 1|log 2 FC < − 0.5; b, upregulated genes were defined as BH-adjusted p value < 0.05 and log 2 FC > 1; c, downregulated genes were defined as BH-adjusted p value < 0.05 and log 2 FC < − 0.5 DEGs, differentially expressed genes; PBMCs, peripheral blood mononuclear cells  (Le et al. 2020). Samples were assessed for the presence of any batch effects between the two groups by running principal components analysis on transposed expression matrices and were rectified using removeBatchEffect function in limma package (Ritchie et al. 2015). Relevant features were annotated with expression matrices to generate curated data for running differential gene expression analysis. Samples were ascribed to the relevant group (T2D/control) using model. matrix function of limma package (Ritchie et al. 2015) producing a binary design matrix. As the detection of DEGs can be enhanced by filtering genes with a low expression level, we assumed a median (50%) cut-off for the gene expression level. A uniform analytic pipeline consisting of the following sequential steps was applied to each microarray: (1) median expression levels were calculated, and those above the median were retained.
(2) From the resulting genes, those expressed in more than two samples were retained, while the others were removed. (3) Model fitting was performed using lmFit function of limma (Ritchie et al. 2015) to enumerate expression levels of T2D and control groups.
(4) Contrasts were defined as "T2D, control," and empirical Bayes step was run to derive differential expression results.
(5) The DEGs, defined as those with log 2 FC > 1 and Benjamini-Hochberg (BH)-adjusted p < 0.05 for upregulated genes and log 2 FC < − 0.5 and BH-adjusted p < 0.05 for downregulated genes, were identified for each microarray. Microarrays with no DEGs were excluded, and those with non-zero DEGs were visualized with a clustered bar chart. Furthermore, DEGs were classified by tissue types and visualized as a Venn diagram. Information on clinical and other features of the microarrays with non-zero DEGs, individual DEGs identified by each dataset, and tissue-based classification of DEGs were also summarized.

Meta-analysis of DEGs: identification of highly perturbed genes
In order to identify highly perturbed genes associated with T2D, we conducted meta-analysis of DEGs using Meta-VolcanoR package (Prada et al. 2020). We implemented all 3 meta-analytic strategies incorporated in this package, namely, random effects model (REM), vote-counting approach (VC), and p value combining approach (CA).
In brief, REM synthesizes a summary fold-change of multiple microarrays based on variance, producing a summary p value which indicates the probability that the summary foldchange is not different to zero. The metathr parameter can be specified to filter the desired percentage of the top-most consistently perturbed genes. Gene perturbation is ranked as per the topconfects approach (Harrison et al. 2019). The VC algorithm produces highly perturbed genes according to user-specified p values and fold-change cut-off levels, taking into account both the number of studies in which a DEG appeared and its gene fold-change sign consistency. Here also, metathr parameter can be defined to extract the required percentage of highly perturbed genes. Meta-synthesis of gene perturbation by CA algorithm is at the mean or median level along with p values derived by Fisher method. A required proportion of top-most DEGs can be identified by specifying metathr parameter with CA as well (Prada et al. 2020).
As required by the package, all microarray datasets with non-zero DEGs; each consisting of the columns gene name (symbol), fold-change (log 2 FC), and p value; and confidence intervals of the fold-change (CI.L and CI.R) were merged to build a list item. For all 3 meta-analytic models, metathr was set at 0.01. For VC, p value and absolute fold-change thresholds were set at 0.05 and 0, respectively.
Highly perturbed genes identified by each model as well as the compiled list of all highly perturbed genes were presented in tabular format. Volcano plots were drawn to illustrate the top-most perturbed genes identified by REM, VC, and CA methods. Inverse cumulative distribution of consistently differentially expressed genes as per VC was plotted to demonstrate the number of genes with perturbed expression in ≥ 1 studies. Detailed meta-analytic outputs

Network analysis: identification of hub genes
The list of highly perturbed genes was fed into GENEMA-NIA (Warde-Farley et al. 2010) to determine the gene-gene interaction network. The interaction network formulated by the GENEMANIA gene function prediction program, based on the multiple association network integration algorithm (MANIA), incorporates a multitude of functional associations including co-expression, pathways, physical interactions, co-localization, genetic interactions, and protein domain similarity. It has been found more accurate and computationally efficient than other gene function prediction methods (Mostafavi et al. 2008;Peña-Castillo et al. 2008). The gene-gene interaction network was first constructed and visualized in GENEMANIA. Next, these interactions were imported to visualize the protein-protein interaction (PPI) network in STRING version 11.0 (Szklarczyk et al. 2019). We used the Centiscape application (Scardoni et al. 2009) in the Cytoscape software (Shannon et al. 2003) to analyze the PPI network and determine the hub nodes. After removing nodes with a connection number < 2, the network was visualized in Cytoscape. Hub genes were defined as those nodes in the network with betweenness, closeness, and degree higher than their mean values. A similar approach has been previously used to demarcate hub genes (Liu et al. 2019b). Topological features of the hub nodes and details of the PPI network derived by Centiscape were summarized.

Downstream analysis of highly perturbed genes and hub genes
Using Enrichr (Chen et al. 2013;Kuleshov et al. 2016;Xie et al. 2021) platform, we ran a series of downstream analyses for both highly perturbed genes and hub genes as outlined below:  The Genotype-Tissue Expression (GTEx) portal contains tissue-specific gene expression and regulation data (GTEx Consortium 2013), whereas the Human Metabolome Database (HMDB) records human metabolomics data (Wishart et al. 2018).

Results
The bioinformatic analytic workflow is summarized in Online Resource 1.

Systematic review outputs
The preliminary search resulted in 178 eligible microarrays, while 45 of these were selected for manual curation. We selected 27 microarrays with sufficient information for subsequent analyses, the details of which are provided in Online Resource 2.

Differential gene expression analysis outputs
The number of DEGs identified by each dataset is shown in Table 1. There were 11 microarrays with no DEGs. In Fig. 1, microarrays with non-zero DEGs (n = 16) are visualized as a clustered bar chart. The identified DEGs belonged to four tissue types (i.e., circulatory; adipose; digestive; skeletal muscle) as visualized in the Venn diagram ( Fig. 2). Clinical and other information of the 16 microarrays with non-zero DEGs is presented in Online Resource 3. Details of DEGs identified by each dataset are presented in Online Resource 4, while DEGs classified by tissue type are presented in Online Resource 5.

A significantly larger proportion of genes associated with T2D is downregulated
Of all the DEGs identified by different tissues (n = 6284), the proportion of downregulated genes (n = 4592) was significantly higher than the proportion of upregulated genes (n = 1692) (p < 0.00000001). At the level of tissue type, circulatory (255 downregulated vs 94 upregulated; p < 0.000001), adipose (1426 downregulated vs 1212 upregulated; p = 0.000017), and digestive (2664 downregulated vs 135 upregulated; p < 0.000001) tissues revealed a similar pattern, while the skeletal muscle (247 downregulated vs

Tissue-specific DEGs are predominant compared to DEGs shared between tissues in T2D
Proportions of tissue-specific DEGs were significantly higher than the proportions shared with one or more other tissue types in circulatory (242 specific vs 89 non-specific; p < 0.000001), adipose (1989 specific vs 342 non-specific; p < 0.000001), and digestive (1875 specific vs 409 non-specific; p < 0.000001) groups, while skeletal muscle (244 specific vs 217 non-specific; p = 0.112959) had no such over-expression (Fig. 2).

Meta-analytic outputs
As shown in Table 2, the three meta-analytic algorithms REM, VC, and CA identified 49, 27, and 8 highly perturbed genes, respectively. The compiled list, after removing redundancies, comprised 79 highly perturbed genes. Volcano plots illustrating the top-most perturbed genes identified by REM, VC, and CA are given in Figs. 3, 4, 5, respectively. Figure 6 presents the inverse cumulative distribution of consistently differentially expressed genes as per VC, plotted to demonstrate the number of genes with perturbed expression in ≥ 1 studies. We present detailed meta-analytic outputs from all 3 approaches in Online Resource 6 and highly perturbed genes identified by each method in Online Resource 7.

Highly perturbed genes associated with T2D comprise both up-and downregulated genes
There was no significant difference (p = 0.410983) between the proportions of up-(38/79) and downregulated (41/79) genes that constituted the highly perturbed gene set (n = 79) identified by all 3 meta-analytic algorithms (

Network analysis outputs
Network analysis identified 28 hub genes, the topological features of which are summarized in Table 3. Figure 7 presents the gene-gene interaction network produced on GENE-MANIA, and details of this network are given in Online Resource 8. The PPI network visualized on STRING version 11.0 (Szklarczyk et al. 2019) is given in Fig. 8. The network created by Cytoscape is provided in Fig. 9. Details of the PPI network derived by Centiscape are provided in Online Resource 9.

Downstream functional analysis outputs
Findings from downstream analyses of highly perturbed genes are summarized in Tables 4, 5, 6 and illustrated in Figs. 10, 11, 12 with details in Online Resource 10. Results from downstream analyses of hub genes are presented in Tables 7,8,9 and visualized in Figs. 13,14,15 with details in Online Resource 11.

Enriched KEGG pathways associated with highly perturbed genes and hub genes of T2D
As per KEGG analysis of highly perturbed genes, 21 pathways were enriched (Online Resource 10). The top 10 pathways were Epstein-Barr virus infection; antigen processing and presentation; autoimmune thyroid disease; maturityonset diabetes of the young; asthma; allograft rejection; graft-versus-host disease; type 1 diabetes mellitus; intestinal immune network for IgA production; and cell adhesion molecules (CAMs). Seven of these (autoimmune thyroid disease; maturity-onset diabetes of the young; asthma; allograft rejection; graft-versus-host disease; type 1 diabetes mellitus; intestinal immune network for IgA production) were associated with downregulated genes in T2D (Table 4).
According to KEGG analysis of hub genes, 12 pathways were enriched (Online Resource 11). The top 10 pathways were insulin secretion; apoptosis; adrenergic signaling in cardiomyocytes; cell adhesion molecules (CAMs); JAK-STAT signaling pathway; transcriptional mis-regulation in cancer; arginine biosynthesis; maturity-onset diabetes of the young; ascorbate and aldarate metabolism; and fatty acid elongation. Four of these (insulin secretion; maturityonset diabetes of the young; ascorbate and aldarate metabolism; fatty acid elongation) were associated with downregulated genes in T2D, while two pathways (JAK-STAT signaling pathway and transcriptional misregulation in cancer) were associated with upregulated genes in T2D (Table 6).

COVID-19 related gene sets associated with highly perturbed genes and hub genes of T2D
Downstream analyses revealed 20 COVID-19-related gene sets associated with the highly perturbed genes of T2D (Online Resource 10), and the top 10 of these are visualized in Fig. 11a. There were 23 COVID-19-related gene sets associated with the hub genes of T2D (Online Resource 11), the top 10 of which are visualized in Fig. 14a.

Gene expression in different cell types associated with highly perturbed genes and hub genes of T2D
There were 168 downregulated GTEx profiles (Online Resource 10) associated with highly perturbed genes of T2D, the top 10 of which consisted of one brain, two thyroids, and seven blood tissue samples (Table 6 and Fig. 12). There were 77 upregulated GTEx profiles (Online Resource 10) associated with highly perturbed genes of T2D, the top 10 of which consisted of six adipose and one each of lung, breast, heart, and blood vessel tissue samples (Table 6 and Fig. 12).
There were 223 downregulated GTEx profiles (Online Resource 11) associated with hub genes of T2D, the top 10 of which consisted of five nervous system (three brain, one nerve, one pituitary), three blood, one breast, and one bladder tissue samples (Table 9 and Fig. 15). There were 178 upregulated GTEx profiles (Online Resource 11) associated with hub genes of T2D, the top 10 of which consisted of six adipose, two blood, and two skin tissue samples (Table 9 and Fig. 15).

Discussion
In this study, we identified highly perturbed genes and hub genes associated with T2D in different tissues of adult humans, via an extensive bioinformatics analytic workflow. Downstream analyses revealed valuable insights on T2D pathogenesis, including associations with other diabetes phenotypes and COVID-19 and patterns of tissue-specific and tissue non-specific differential gene expression as well as pathophysiological manifestations such as those related to insulin action, immunity, and apoptosis. Salient findings of the study which contribute towards the understanding of the genetic basis of T2D are further discussed below. The comprehensive evidence synthesis approach with opensource gene expression data exemplified in this study could be replicated to gain high-level evidence synthesis for other clinical conditions.

Patterns of differential gene expression in T2D
Our findings indicate that T2D seems rather a disorder of gene downregulation than upregulation, when the whole genome is considered. This is consistent with previous studies where a preponderance of gene downregulation was associated with T2D (Takematsu et al. 2020;Palsgaard et al. 2009). Also,  (Meugnier et al. 2007). A similar pattern has been observed in T1D (Yip et al. 2020) and other endocrine disorders such as polycystic ovary syndrome (Idicula- Thomas et al. 2020). In contrast, highly perturbed genes and hub genes associated with T2D, which might together constitute the candidate gene set critical for pathogenicity of T2D, were found to contain both up-and downregulated genes. This presentation suggests a more complex dysregulation at the crux of the GRN of T2D, involving actions and interactions between both repressed and augmented genes.

Tissue-specific and tissue non-specific DEGs associated with T2D
Results of the present study indicate the predominance of tissue-specific DEGs in T2D. This supports the use of target  tissue gene expression analysis as a viable avenue for identifying tissue-specific T2D biomarkers. A previous analysis integrating multiple tissue transcriptomics and PPI data to explore molecular biomarkers of T2D confirmed the presence of common signatures (Calimlioglu et al. 2015). We also observed common DEGs across different tissue types which can act as confluent molecular signatures of T2D. Identification of tissue-specific and non-specific molecular gene signatures of T2D facilitates downstream exploration of key pathways amenable to therapeutic targeting and drug repurposing efforts.

Shared gene enrichment across diabetes phenotypes
As revealed by KEGG pathway analyses, both MODY and T1D were enriched pathways associated with highly perturbed genes of T2D, while MODY was also an enriched Downregulation of SLC2A2 is associated with not only T2D (Solimena et al. 2018) but also neonatal diabetes (Sansbury et al. 2012) and early childhood diabetes (Alhaidan et al. 2020), suggesting a likely role in insulin secretion. Amylin (IAPP), a gluco-modulatory hormone co-expressed with insulin by pancreatic β cells, is downregulated in both T1D and advanced T2D (Abedini et al. 2013), while amylin agonists are considered novel therapeutic agents for treating diabetes (Sonne et al. 2021). Moreover, human amylin plays a protective role against autoimmune diabetes inducing CD4 + Foxp3 + regulatory T cells (Zhang et al. 2018). Downregulation of HLA-DRB4 in peripheral blood mononuclear cells is associated with T2D as well as dyslipidemia and periodontitis (Corbi et al. 2020), while a meta-analysis revealed that the lack of HLA-DRB5 increased T2D risk (Jacobi et al. 2020). Of note, both HLA-DRB4 and HLA-DRB5 are associated with β cell autoantibodies and T1D (Zhao et al. 2016), with previous studies reporting that both T1D and T2D share HLA class II locus components (Jacobi et al. 2020). Interestingly, two of the hub genes of T2D found in our study (MMP9, ARG1) have been found as hub genes of T1D in a previous analysis (Yang et al. 2020). Collectively, these findings support some degree of shared genetic architecture between T2D and other diabetes pathologies.

T2D as a disorder of insulin secretion and action
Downstream analyses provided insights into the characterization of T2D as a disorder of insulin secretion and action. We found that insulin secretion was the most significant KEGG pathway associated with hub genes, whereby two downregulated hub genes (SLC2A2, SNAP25) in T2D were underlying this enrichment. Zinc, the most significant HMDB metabolite associated with both highly perturbed genes and hub genes of T2D, is an essential element with key regulatory roles in insulin synthesis, storage, and secretion . Other metabolomic markers associated with highly perturbed genes included magnesium which is necessary for insulin signaling (Piuri et al. 2021) as well as manganese which is involved in insulin synthesis and secretion ). Together, these findings underscore the effects on insulin production and action as pivotal to T2D pathogenesis.

Pathophysiological manifestations of T2D
Apoptosis Downstream analyses revealed that multiple GO and KEGG pathways associated with apoptosis, including intrinsic apoptotic signaling pathway, were enriched in T2D. It is known that hyperglycemia-induced β cell apoptosis, a hallmark in T2D progression, occurs via intrinsic pathways causing reduced islet mass and metabolic abnormalities (Wali et al. 2013). Hyperglycemia-induced apoptosis has been reported to occur in other sites such as renal cells (Jung et al. 2012) and coronary arteries (Kageyama et al. 2011), indicating a possible role in disease progression and the onset of complications.   -Immunity Downstream analyses also revealed that multiple immunity-related GO and KEGG pathways, encompassing both innate and humoral immune responses, were enriched in T2D. These included multiple ontologies involving neutrophils and antigen processing and presentation as well as severe immune reactions such as graft-versus-host disease and allograft rejection. Impaired immunity in T2D and consequent susceptibility to infections and complications is frequently observed (Berbudi et al. 2020). A deeper understanding of the genomics underlying impaired immunity in T2D might provide opportunities to personalize the management of comorbidities and pharmacotherapy.

COVID-19 and T2D
Epidemiological studies strongly suggest poorer prognosis of COVID-19 among people with T2D (Selvin and Juraschek, 2020), although underlying mechanisms are not wellunderstood (Apicella et al. 2020). Downstream analysis of highly perturbed genes and hub genes of T2D in the present study revealed a large number of enriched COVID-19-related gene sets, providing support for this putative link at a more granular level.

Over-and under-expression of genes in different tissues associated with T2D
Downstream analysis of GTEx profiles identified tissues that are likely to demonstrate under-and over-expression of DEGs associated with T2D. Findings indicate that adipose tissue tends to over-express marker genes of T2D, while these might be under-expressed in other tissues such as those of the nervous system. These findings have implications for biomarker discovery and can guide further research on tissues which should be explored for identifying DEGs.

Conclusions
Findings of this study contribute towards the understanding of the genetic basis of T2D, and further research is warranted to substantiate the molecular mechanisms underlying these findings which is fundamental to establishing precision T2D medicine initiatives. The proposed bioinformatics pipeline may have broader use as a judicious strategy to identify gene perturbations and pathophysiological mechanisms of other clinical conditions beyond T2D which ought to be validated in future research. Finally, this study describes an exemplary approach to applying comprehensive evidence synthesis using existing open-source gene expression data. Other researchers are encouraged to apply this methodology to obtain high-level evidence from existing multiple datasets, thereby getting the most value from existing bioinformatics sources.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.