Introduction

Type 1 diabetes (T1D) (insulin-dependent) is a core challenge for endocrine research around the world [1]. Approximately 5 to 10% of the childhood population is affected with T1D worldwide [2]. T1D affects the eyes, kidneys, heart, peripheral and autonomic nervous systems [3]. Pancreatic cells, particularly β-cells, play a key role in the occurrence and progression of T1D [4]. Treatment for T1D includes targeting β-cells and β-cells regeneration [5]. However, T1D is a complex disease and its biology remains poorly understood [6].

There are several important risk factors for T1D, such as genetic and environmental factors [7, 8]. Previous studies identified aspects of the molecular mechanism of T1D advancement. T1D has been genetically associated with genes and signaling pathways, CTLA-4 [9], SUMO4 [10], CYP27B1 [11], PD-1 [12], KIAA0350 [13], tumor necrosis factor alpha signaling pathways [14], NLRP3 and NLRP1 inflammasomes signaling pathways [15], HIF-1/VEGF signaling pathway [16], l-arginine/NO pathway [17], and CaMKII/NF-κB/TGF-β1 and PPAR-γ signaling pathway [18]. Next-generation sequencing (NGS) has drastically increased the understanding mechanism of T1D, and analyses of these data can provide insight into effective diagnostic and therapeutic T1D treatments [19]. Thus, identifying key molecular biomarkers is essential for early diagnosis, prevention, and treatment of T1D.

It worth a lot of money and time to identify disease related molecular biomarkers by experiment alone. With the wide application of expression profiling by high throughput sequencing data, there were huge genomics data deposited in public databases [20]. The progression of computational tools gives us an alternative method to diagnose novel molecular biomarkers.

In this investigation, we employed the bioinformatics approach to discover the differentially expressed genes between T1D patients and healthy donors. Original expression profiling by high throughput sequencing dataset GSE123658 was downloaded. 39 T1D patients’ samples and 43 healthy donors’ samples were analyzed in our investigation. Commonly altered DEGs were isolated from integrated data. Additionally, GO/ REACTOME pathway analysis, construction of protein–protein interaction network, modules, target gene - miRNA regulatory network and target gene - TF regulatory network analysis were performed to analyze these data. Four hub genes (EGFR, GRIN2B, GJA1, CAP2, MIF, POLR2A, PRKACA, GABARAP, TLN1 and PXN) were identified. ROC (receiver operating characteristic) curve and RT-PCR analysis were used to verify clinically relevant hub genes. The aim of this investigation was to gain a better understanding of the underlying molecular mechanisms and to discover molecular biomarkers for T1D.

Material and methods

Data resources

Expression profiling by high throughput sequencing dataset GSE123658 was downloaded from the GEO database (http://www.ncbi.nlm.nih.gov/geo/) [21]. CPM count normalization performed on the original dataset GSE123658 from GEO databse using package edgeR package [22], voom function [23], and Limma [24] of R software. The data was produced using a GPL18573 Illumina NextSeq 500 (Homo sapiens). The GSE123658 dataset contained data from 82 samples, including 39 T1D patients’ samples and 43 healthy donors’ samples.

Identification of DEGs

The identification of DEGs between 39 T1D patients’ samples and 43 healthy donors’ samples was performed using Limma package in R bioconductor. lmFit function in the limma package to construct linear model for individual gene [25]. makeContrasts function in the limma package to compose similarity between T1D and healthy donors groups (log fold-changes) are obtained as contrasts of these fitted linear model. eBayes is a function in limma package which figure out empirical Bayes predicts of DEGs [26]. topTable function in limma package to obtain a table of the most significant Up and down regulated genes from a eBayes model fit. To correct the discovery of statistically important molecular biomarkers and limitations of false-positives, we using the adjusted P-value and Benjamini and Hochberg false discovery rate method [27]. Fold-change (FC) and adjust p-values were used to found DEGs. A |log2FC|  > 0.94 for up regulated genes, |log2FC|  -0.39 for down regulated genes and P-value  <  0.05 were used as considered statistically significant. The volcano plot was implemented using ggplot2 package [28], and the heat map was established using gplots package in R language.

Gene Ontology (GO) and pathway enrichment analyses of DEGs

Gene Ontology (GO) (http://www.geneontology.org) analysis is a routine analysis for annotating genes and determining biological component, including biological process (BP), cellular component (CC) and molecular function (MF) [29]. REACTOME (https://reactome.org/) [30] pathway database is applied for classification by correlating gene sets into their respective pathways. The ToppGene (ToppFun) (https://toppgene.cchmc.org/enrichment.jsp) [31] is a gene functional classification tool that objective to provide a extensive set of functional annotation tools for authors to recognize the biological explanation behind large lists of genes. P < 0.05 was find statistically significant.

PPI network construction and module analysis

The online Human Integrated Protein-Protein Interaction rEference (HIPPIE) (http://cbdm.uni-mainz.de/hippie/) [32] online database was using to predicted the PPI network information. Analyzing the interactions and functions between DEGs may provide information about the mechanisms of generation and development of disease (PPI score  >  0.4). Cytoscape 3.8.0 (http://www.cytoscape.org/) [33] is a bioinformatics platform for constructing and visualizing molecular interaction networks. The Network Analyzer Cytoscape plug-in was used to find hub genes were screened using highest node degree [34], betweenness centrality [35], stress centrality [36] and closeness centrality [37] methods and hub genes were further analyzed for pathway and GO enrichment analysis. The plug-in PEWCC1 (http://apps.cytoscape.org/apps/PEWCC1) [38] of Cytoscape was applied to detect densely connected regions in PPI networks. The PPI networks were constructed using Cytoscape and the most key module in the PPI networks was preferred using PEWCC1. The criteria for selection were set as follows: Max depth = 100, degree cut-off = 2, Node score cut-off = 0.2, PEWCC1 scores >5, and K-score = 2.

Construction of miRNA - target regulatory network

miRNet database (https://www.mirnet.ca/) [39] online database was used to predict miRNAs that targeted the DEGs associated with T1D. The DEGs were selected according to the screening criterion of P value <0.05. The results were exported into the Cytoscape software for analysis. The target genes - miRNA regulatory network was constructed through network topology prosperities. The node degree was determined using the Network analysis plugin, and miRNAs with a node degree >12.

Construction of TF - target regulatory network

NetworkAnalyst database (https://www.networkanalyst.ca/) [40] online database was used to predict TFs that targeted the DEGs associated with T1D. The DEGs were selected according to the screening criterion of P value <0.05. The results were exported into the Cytoscape software for analysis. The target genes - TF regulatory network was constructed through network topology prosperities. The node degree was determined using the Network analysis plugin, and TFs with a node degree >12.

Validation of hub genes

Receiver operating characteristic (ROC) curve analysis [41] was implemented to calculate the sensitivity and specificity of the DEGs for T1D diagnosis using the R package by using the generalized linear model (GLM) in machine learning algorithms [42]. An area under the curve (AUC) value was determined and used to label the ROC effect. The RIN-m5F (ATCC CRL-11605) cell line procured from ATCC for T1D. RINm5F cell line was grown in RPMI-1640 medium added with 10% fetal bovine serum and penicillin/streptomycin. Incubate this cell line at 37 °C in a 5% CO2 in humidified cell culture incubator. The HIT­T15 (ATCC CRL­1777) cell line procured from ATCC for normal. CRL-1777 cell line was grown in Ham’s F12K medium added with 10% fetal bovine serum,87.5%; dialyzedhorse serum, 2 mM L­glutamine, 1.5 g/L sodium bicarbonate, and penicillin/streptomycin. Incubate the this cell line at 37 °C in a 5% CO2 in humidified cell culture incubator. In RT-PCR analysis, total RNA was isolated from in vitro cultured cells of diabetics and normal using a TRI Reagent (Sigma, USA). cDNA was synthesized using 2.0 μg of total RNA with the FastQuant RT kit (with gDNase; Tiangen Biotech Co., Ltd.). The relative mRNA expression was measured in the QuantStudio 7 Flex real-time PCR system (Thermo Fisher Scientific, Waltham, MA, USA). The relative expression levels were determined by the 2-ΔΔCt method and normalized to internal control β-actin [42]. All RT-PCR reactions were performed in triplicate. The primers used to explore mRNA expression of ten hub genes were shown in Table 1.

Table 1 The sequences of primers for quantitative RT-PCR

Molecular Docking studies

The surflex-docking studies of the designed molecule, standard on over expressed genes of PDB protein were performed using SYBYL-X 2.0 perpetual software. All the designed molecules and standard were outlined using ChemDraw Software, imported and saved in sdf. templet using open babel free software. The protein structures of EGFR (epidermal growth factor receptor) and its co-crystallised protein of PDB code 2XYJ, 2XYX and GRIN2B its co-crystallised protein of PDB code and its co-crystallised protein of PDB code 5EWL, 6E7V were extracted from Protein Data Bank [43,44,45,46]. Together with the TRIPOS force field, GasteigerHuckel (GH) charges were added to all designed molecules and standard for the structure optimization process. Additionally, using MMFF94s and MMFF94 algorithm methods, energy minimization was done. The protein preparation was carried out after incorporation of protein. The co-crystallized ligand and all water molecules were removed from the crystal structure; more hydrogen’s were added and the side chain was set. TRIPOS force field was used for the minimization of structure. The compounds’ interaction efficiency with the receptor was represented by the Surflex-Dock score in kcal / mol units. The interaction between the protein and the ligand, the best pose was incorporated into the molecular area. The visualization of ligand interaction with receptor is done by using discovery studio visualizer.

Results

Identification of DEGs

With a threshold of P-value <0.05 and absolute value of |log2FC|  > 0.94 for up regulated genes, |log2FC|  -0.39 for down regulated genes, DEGs were identified from dataset. 284 DEGs were screened from the GSE123658 dataset, which consisted of 142 up regulated genes and 142 down regulated genes and only top ten up regulated genes and down regulated genes are listed in Table 2. Volcano plots were used to visualize differential expression of genes between the T1D group and healthy donors group (Fig.1). A heat map showed expression profiling of DEGs in the analysis result (Fig.2).

Table 2 The statistical metrics for key differentially expressed genes (DEGs)
Fig. 1
figure 1

Volcano plot of differentially expressed genes. Genes with a significant change of more than two-fold were selected. Green dot represented up regulated significant genes and red dot represented down regulated significant genes

Fig. 2
figure 2

Heat map of differentially expressed genes. Legend on the top left indicate log fold change of genes. (A1 – A43 = healthy donors’ samples; B1 – B14 = T1D samples)

Gene Ontology (GO) and pathway enrichment analyses of DEGs

To identify the pathways which had the most significant involvement with the genes identified, up regulated and down regulated genes are listed in Table 3 and Table 4. DEGs were submitted into ToppGene for GO and REACTOME pathway enrichment analysis. GO enrichment analysis revealed that in BP terms, the up regulated genes were mainly enriched in cell-cell signaling and reproductive process, whereas down regulated genes were mainly enriched in vesicle fusion and biological adhesion. In CC terms, up regulated genes were mainly enriched in integral component of plasma membrane and supramolecularfiber, whereas down regulated genes were mainly enriched in whole membrane and cell junction. In MF terms, up regulated genes were mainly enriched in signaling receptor activity and structural molecule activity, whereas down regulated genes were mainly enriched in lipid binding and GTPase binding. REACTOME pathway analysis demonstrated that up regulated genes were significantly enriched in signaling by GPCR and muscle contraction, whereas down regulated genes were mainly enriched in innate immune system and cytokine signaling in immune system.

Table 3 The enriched pathway terms of the up and down regulated differentially expressed genes
Table 4 The enriched GO terms of the up and down regulated differentially expressed genes

PPI network construction and module analysis

Interactions between the identified up and down regulated genes were reported by constructing a PPI network. In total, there were 5017 nodes and 8133 edges in the network (Fig.3a). According to node degree, betweenness centrality, stress centrality and closeness centrality levels, the top four hub nodes were: EGFR (degree, 1343; betweenness, 0.514227; stress, 4.35E+08; closeness, 0.441893), GRIN2B (degree, 195; betweenness, 0.008701; stress, 14,535,992; closeness, GJA1 (degree,170 0.03151; betweenness, 64,939,702; stress, 0.31364; closeness, 0.282164), CAP2 (degree, 108; betweenness, 0.004218; stress, 888,128; closeness, 0.325945), MIF (degree, 99; betweenness, 0.025028; stress, 7,627,912; closeness, 0.362124), POLR2A (degree, 512; betweenness, 0.095104; stress, 2E+08; closeness, 0.3738), PRKACA (degree, 350; betweenness, 0.104658; stress, 44,280,174; closeness, 0.378052), GABARAP (degree, 322; betweenness, 0.144708; stress, 36,234,824; closeness, 0.358401), TLN1 (degree, 223; betweenness, 0.027406; stress, 67,578,124; closeness, 0.35998), and PXN (degree, 173; betweenness, 0.029203; stress, 29,047,866; closeness,0.360318) and are listed in Table 5. Significant modules were subsequently constructed with 12 nodes and 23 edges for up regulated genes (Fig.3b) and 5 nodes and 10 edges for down regulated genes, which gained the highest PEWCC1 score (Fig.3c). Subsequent functional enrichment analysis revealed that the genes in these modules were mainly enriched in cell-cell signaling and cytokine signaling in immune system.

Fig. 3
figure 3

PPI network and the most significant modules of DEGs. a The PPI network of DEGs was constructed using Cytoscape. b The most significant module was obtained from PPI network with 12 nodes and 23 edges for up regulated genes. c The most significant module was obtained from PPI network with 5 nodes and 10 edges for down regulated genes. Up regulated genes are marked in green; Down regulated genes are marked in red

Table 5 Topology table for up and down regulated genes

Prediction of key miRNAs

The regulatory relationships between the target genes and their miRNAs were established using Cytoscape, which showed that the single gene was regulated by multiple miRNAs is shown in Fig.4a. Subsequently, 106 miRNAs (ex, hsa-mir-4257) targeting GRIN2B, 83 miRNAs (ex, hsa-mir-564) targeting EGFR, 66 miRNAs (ex, hsa-mir-587) targeting DKK1, 64 miRNAs (ex, hsa-mir-941) targeting GJA1, 54 miRNAs (ex, hsa-mir-561-3p) targeting RGS4, 190 miRNAs (ex, hsa-mir-4300) targeting TLN1, 139 miRNAs (ex, hsa-mir-5694) targeting IGF2R, 117 miRNAs (ex, hsa-mir-378b) targeting POLR2A, 113 miRNAs (ex, hsa-mir-3918) targeting ARHGAP1 and 102 miRNAs (ex, hsa-mir-6719-3p) targeting HIP1, and were verified in miRNet database are listed in Table 6. Integrating with the result of REACTOME pathway analysis, it was indicated that these key target genes - miRNA network was mainly involved in the signaling by GPCR and innate immune system.

Fig. 4
figure 4

a Target gene - miRNA regulatory network between target genes and miRNAs. b Target gene - TF regulatory network between target genes and TFs. Up regulated genes are marked in green; down regulated genes are marked in red; The chocolate color triangle nodes represent the key miRNAs; the blue color triangle nodes represent the key miRNAs

Table 6 miRNA - target gene and TF - target gene interaction

Prediction of key TFs

The regulatory relationships between the target genes and their TFs were established using Cytoscape, which showed that the single gene was regulated by multiple TFs is shown in Fig.4b. Subsequently, 19 TFs (ex, FOXD1) targeting RGS4, 16 TFs (ex, GATA2) targeting EYA1, 12 TFs (ex, FOXL1) targeting GRIA4, 11 TFs (ex, TP53) targeting CCL19, 11 TFs (ex, JUND) targeting PRL, 15 TFs (ex, STAT3) targeting PRKACA, 14 TFs (ex, TFAP2A) targeting GAB2 , 12 TFs (ex, KLF5) targeting HIP1, 10 TFs (ex, PPARG) targeting PXN and 9 TFs (ex, HINFP) targeting RGL2 were verified in NetworkAnalyst database are listed in Table 6. Integrating with the result of REACTOME pathway analysis, it was indicated that these key target genes - TF network was mainly involved in the signaling by GPCR and innate immune system.

Validation of hub genes

As these 4 genes are remarkably expressed in T1D, we executed a ROC curve analysis to calculate their sensitivity and specificity for the diagnosis of T1D. As shown in Fig. 5 EGFR, GRIN2B, GJA1, CAP2, MIF, POLR2A, PRKACA, GABARAP, TLN1 and PXN achieved an AUC value of >0.982, demonstrating that these genes have high sensitivity and specificity for T1D diagnosis. We further used RT-PCR to detect the mRNA expression of the hub gene. The 10 hub genes contain two up regulated genes (EGFR, GRIN2B, GJA1, CAP2 and MIF) and two down regulated gene (POLR2A, PRKACA, GABARAP, TLN1 and PXN). The RT-PCR data showed that although the trend of expression patterns of these 10 hub genes were consistent, among these up regulated genes, only EGFR, GRIN2B, GJA1, CAP2 and MIF were significantly up regulated in T1D. In addition, the expression of POLR2A, PRKACA, GABARAP, TLN1 and PXN were reduced in T1D (Fig. 6).

Fig. 5
figure 5

ROC curve analyses of hub genes. a EGFR. b GRIN2B. c GJA1. d CAP2. e MIF. f POLR2A. g PRKACA. h GABARAP. i TLN1. j PXN

Fig. 6
figure 6

RT-PCR analyses of hub genes. a EGFR. b GRIN2B. c GJA1. d CAP2. e MIF. f POLR2A. g PRKACA. h GABARAP. i TLN1. j PXN

Molecular Docking studies

The docking simulations are conducted in the present study is to identify the active site conformation and major interactions responsible for complex stability with the receptor binding sites. In T1D over expression of genes are identified and their proteins of x-ray crystallographic structure are selected from PDB for docking studies. The drugs containing thiazolidindione ring pioglitazone are most commonly used either alone or in combination with other antidibetic drugs. The docking studies of pioglitazone (as standard) and designed molecules containing the heterocyclic ring of thiazolidinedione have been carried out using Sybyl X 2.1 drug design software. The docking studies were performed to know the biding interaction of pioglitazone standard and designed molecules on identified over expressed genes of protein. The X- RAY crystallographic structure of two proteins from each over expressedgenes of epidermal growth factor receptor (EGFR), cyclase associated actin aytoskeleton aegulatory arotein 2 (CAP2), glutamate inotropic receptor NMDA type subunit 2B (GRIN2B), gap junction protein plpha 1 (GJA1) and macrophage migration inhibitory factor (MIF) and one protein from each of their co-crystallised PDB code 2XYJ, 4 K41, 5EWL, 2ZW3 and 4WR8 respectively were selected for the docking studies to identify and predict the potential molecule based on the binding score with the protein and effective against type 1 diabetes mellitus. A total of 54 molecules were designed and the binding score greater than six are believed to be good, few molecules obtained excellent binding score (C-score) with particular protein greater than 10The designed molecules obtained binding or score c- score less than 5 are TSPZP19, TBPZ38, TBPZ41 and TSIO4, TSPZ12, TBIO32, TBPZ40 TBPZ41 TBPZ42 TBPZ43 TBPZ44 TBPZ45 TBPZP46 and TSIO2, TSIO3, TSIO5, TSPZP20, TSPZP23, TBIO32, TBPZP48, TBPZP53 and TSIO1, TSIO9, TSPZ11, TSPZ12, TSPZ13, TSPZ14, TSPZ18, TSPZP19, TSPZP20, TSPZP23, TSPZP26, TSPZP27, TBIO28, TBIO30, TBIO31 TBIO32, TBIO35, TBIO36, TBPZ37, TBPZ39, TBPZ40, TBPZ41, TBPZ44, TBPZ45, TBPZP47, TBPZP49, TBPZP50 with PDB code of protein 2XYJ and 5EWL and 2ZW3 and 4WR8 respectively. The molecules obtained binding score 5 to 7 are TSIO1, TSIO2, TSIO3, TSIO4, TSIO5, TSIO7, TSIO8, TSIO9, TSPZ10, TSPZ11, TSPZ12, TSPZ13, TSPZ14, TSPZ16, TSPZ17, TSPZ18, TSPZP20, TSPZP21, TSPZP24, TSPZP25, TSPZP26, TBIO28, TBIO29, TBIO31, TBIO32, TBIO33, TBIO35, TBIO36, TBPZ39, TBPZ40, TBPZ42, TBPZ43, TBPZ40, TBPZ42, TBPZ43, TBPZ44, TBPZ45, TBPZP47, TBPZP48, TBPZP49, TBPZP50, TBPZP52, TBPZP53, TBPZP54 and TSIO1, TSIO2, TSIO3, TSIO5, TSIO6, TSIO7 TSIO8, TSIO9, TSPZ10, TSPZ11, TSPZ13, TSPZ14, TSPZ15, TSPZ16, TSPZ17, TSPZ18, TSPZP19, TSPZP20, TSPZP21 TSPZP22, TSPZP23, TSPZP24, TSPZP25, TSPZP26, TSPZP27, TBIO28, TBIO29, TBIO30, TBIO31, TBIO33, TBIO34, TBIO35, TBIO36, TBPZ37, TBPZ38, TBPZ39, TBPZ40, TBPZ42, TBPZ43, TBPZ44, TBPZ45, TBPZP46, TBPZP47, TBPZP49, TBPZP50, TBPZP51, TBPZP52, TBPZP53 and TSIO1, TSIO4, TSIO7, TSIO8, TSIO9, TSPZ10, TSPZ11, TSPZ12, TSPZ13, TSPZ14, TSPZ15, TSPZ16, TSPZ17, TSPZ18, TSPZP19, TSPZP21, TSPZP22, TSPZP24, TSPZP25, TSPZP26, TSPZP27, TBIO28, TBIO29, TBIO30, TBIO31, TBIO34, TBIO35, TBIO36, TBPZ37, TBPZ38, TBPZ39, TBPZ40, TBPZ41, TBPZ42, TBPZ43, TBPZ44, TBPZ45, TBPZP46 TBPZP47, TBPZP49, TBPZP50, TBPZP52 and TSIO2, TSIO3, TSIO4, TSIO5, TSIO6, TSIO7, TSIO8, TSPZ10, TSPZ15, TSPZ16, TSPZ17, TSPZP21, TSPZP22, TSPZP25, TBIO29, TBIO33, TBIO34, TBPZ38 TBPZ42, TBPZ43, TBPZP46, TBPZP48 with PDB protein 2XYJ and 5EWL and 2ZW3 and 4WR8 respectively. The molecules obtained c-score greater than 7 and less than 10 are TSIO6, TSPZ15, TSPZP22, TSPZP23, TSPZP27, TBIO30, TBIO34, TBPZ37, TBPZP46, TBPZP51, PIO (STD) and the molecules TSIO1, TSIO2, TSIO3, TSIO4, TSIO5, TSIO6, TSIO7, TSIO9, TSPZ10, TSPZ11, TSPZ12, TSPZ13, TSPZ14,TSPZ16, TSPZ17, TSPZ18, TSPZP19, TSPZP20, TSPZP21, TSPZP22, TSPZP23, TSPZP25, TSPZP26, TSPZP27, TBIO28, TBIO29, TBIO31, TBIO32, TBIO33, TBIO34, TBIO36, TBPZ37, TBPZ38, TBPZ39, TBPZ40, TBPZ41, TBPZ42, TBPZ43, TBPZ44, TBPZ45, TBPZP46 TBPZP47, TBPZP48, TBPZP49, TBPZP50, TBPZP51, TBPZP52, TBPZP53, TBPZP54, PIO (STD) and the molecules TSIO6, TBIO33, TBPZP51 and the molecule TBPZP51 with PDB code of 2XYJ and 4 K41 and 2ZW3 and 4WR8 respectively. Pioglitazone was used as a standard; it obtained good binding score of 8.396 and 7.648with two PDB proteins 2XYJ and 4WR8 respectively.

Discussion

Pathological complications of T1D remains undetermined, hyperglycemia develops to play a key role [47]. Although T1D is rare compared to type 2 diabetics, it might affect in any part of the body, such as the eyes, kidneys, heart, peripheral and autonomic nervous systems [48]. Considering the poor prognosis of T1D, understanding the specific molecular biomarkers of the disease is important for initial diagnosis and therapy to increase survival rates. In this investigation, we examined the expression profiling by high throughput sequencing dataset GSE123658 including T1D group and healthy donors group to identify the molecular mechanism of T1D and seek some molecular biomarkers. Bioinformatics analysis of these biological factors is applied to explore genes that are favorable to treatment.

In total 284 DEGs, 142 up regulated and 142 down regulated genes were identified. Polymorphic gene ARMS2 has an important role in the advancement of T1D [49]. Polymorphic gene AS3MT was reported to be associated with progression of T1D [50]. Wang et al [51] also reported that CRHR2 are abnormally expressed in hypertension patients, but this gene might be induces hypertension in T1D patients. Investigation has indicated that RNF122 decrease expression is associated with hyperactivity disorder [52]; this finding is consistent with our results and indicates that RNF122 might be involved in the development of T1D. A previous investigation found that autophagy regulating TP53INP2 gene expression was associated with development of T1D [53]. Polymorphic gene CRTC2 is well known for its critical role in type 2 diabetes [54], but this polymorphic gene might be responsible for advancement of T1D.

GO and REACTOME pathway enrichment analyses were performed to explore interactions among the DEGs. These DEGs were mainly enriched in cell-cell signaling, integral component of plasma membrane, signaling receptor activity, signaling by GPCR, vesicle fusion, whole membrane, lipid binding and innate immune system. Altered MYH6 gene was known to play a role in congenital heart defects [55], but this gene might be induces T1D in patients with congenital heart defects. McKenna et al [56] found that AVP (arginine vasopressin) play essential roles in the T1D. Yue et al [57] concluded that high GRIA4 expression was correlated with abdominal aortic aneurysm progression, but this gene might be answerable for advancement of T1D in patients with abdominal aortic aneurysm. Kochetova el at [58] found that GRIN2B is a significant biomarker for type 2 diabetes compared to healthy controls, but it was confirmed for the first time in our study that GRIN2B expression in T1D indicates a good prognosis. Recent investigations demonstrated that DCC (DCC netrin 1 receptor) gene can mediate angiogenesis and plays an important role in diabetic kidney disease [59]. Ruiz de Azua et al [60] indicated that RGS4 was associated progression of type 2 diabetes, but this gene might be essential for T1D progression. A previous investigation demonstrated that GJA1 was associated with the progression of atrial fibrillation [61], but might be important gene signatures of T1D in patients with atrial fibrillation. Faienza et al [62] and Galán et al [63] demonstrated that, GREM1 and EGFR (epidermal growth factor receptor) might be important for predicting treatment response in T1D. TRHR (thyrotropin releasing hormone receptor) is considered a potential biomarker of hypertension [64], but this gene might be associated with development of T1D in patients with hypertension. PTPRT (protein tyrosine phosphatase receptor type T) has been reported to serve a role in obesity associated insulin resistance [65], but this gene might be involved in progression of T1D. Research has demonstrated that FCAMR (Fc fragment of IgA and IgM receptor) gene could contribute to progression of atherosclerosis [66], but this gene might be crucial for progression of T1D in patients with atherosclerosis. Sun et al. [67] revealed that CCL19 was associated with diabetic nephropathy in patients with T1D. DSP (desmoplakin) has been reported to be expressed in cardiomyopathy [68], but this gene might be linked with progression of T1D in patients with cardiomyopathy. Xu et al. [69] have demonstrated that ITIH4 are related with coronary heart disease, but this gene might be responsible for advancement of T1D in patients with coronary heart disease. Miyashita et al [70] demonstrate that GAB2 plays a role in Alzheimer disease progression, but this gene might be crucial for T1D in patients with Alzheimer disease. McCann et al. [71] showed that IGF2R played an important role in T1D. Previous studies reported that the expression of MYADM (myeloid associated differentiation marker) induced hypertension [72], but this gene might be liable for advancement of T1D in patients with hypertension. Fan et al. [73] and Chan et al. [74] revealed that TPCN2 and APOL1 were associated with type 2 diabetes, but these genes might be responsible for development of T1D. Previous investigation had confirmed that MEFV (MEFV innate immuity regulator, pyrin) play critical roles in ischemic heart disease [75], but this gene might be essential for progression of T1D in patients with ischemic heart disease. Wang et al [76] state that the expression of DTX4 is important event in obesity, but this gene might be linked with progression of T1D in patients with obesity.

We analyzed the protein–protein interactions (PPI) and modules of the DEGs involved in T1D. Table 5 summarizes the PPI network hub genes (five up regulated and five down regulated) that were identified in the T1D, which included EGFR, GRIN2B, GJA1, CAP2, MIF, POLR2A, PRKACA, GABARAP, TLN1 and PXN. A recent study showed that protein expression levels of CAP2 were up regulated in cardiomyopathy patients [77], but this gene might be responsible for T1D in patients with cardiomyopathy. MIF (macrophage migration inhibitory factor) [78] and KIF1A [79] were reported to be associated with T1D. Recent research suggested that PXN (paxillin) is involved in hypertension [80], but this gene might be associated with progression of T1D in patients with obesity. POLR2A, PRKACA, GABARAP, TLN1 and CIITA (class II major histocompatibility complex transactivator) might be considered as a novel biomarkers associated with the development of T1D.

Target gene - miRNA regulatory network and target gene - TF regulatory network analysis demonstrated that DEGs interacted directly or indirectly. The more edges associated with genes, miRNAs and TFs indicated the more potential selection for the targets. Table 6 summarizes the target gene - miRNA regulatory network and target gene - TF regulatory network (target genes, miRNAs and TFs) that were identified in the T1D, which included GRIN2B, EGFR, DKK1, GJA1, RGS4, TLN1, IGF2R, POLR2A, ARHGAP1, HIP1, RGS4, EYA1, CCL19, PRL, PRKACA, GAB2, HIP1, PXN , RGL2, hsa-mir-4257, hsa-mir-564, hsa-mir-587, hsa-mir-941, hsa-mir-561-3p, hsa-mir-4300, hsa-mir-5694, hsa-mir-378b, hsa-mir-3918, hsa-mir-6719-3p, FOXD1, GATA2, FOXL1, TP53, JUND, STAT3, TFAP2A, KLF5, PPARG and HINFP. STAT3 had been reported to be involved in the pathogenesis of T1D [81]. Polymorphic gene GATA2 has been reported to be crucial for the progression of coronary artery disease [82], but this gene might be responsible for advancement of T1D in patients with coronary artery disease. PRKACA (protein kinase cAMP-activated catalytic subunit alpha), DSP (desmoplakin), hsa-mir-4257, hsa-mir-564, hsa-mir-4300, hsa-mir-5694, RGS4, FOXD1, EYA1, TFAP2A and GAB2 might be considered as a novel biomarkers associated with the development of T1D.

In addition, we also performed validation of hub genes by ROC analysis and RT-PCR. Results showed that these hub genes differentiated T1D group from healthy donors group, and may be candidates for diagnostic biomarkers and new therapeutic target. Moreover, EGFR, GRIN2B, GJA1, CAP2, MIF, POLR2A, PRKACA, GABARAP, TLN1 and PXN are involved in progression of T1D.

Among all few molecules of TSIO8, TSPZ15, TSPZP24, TBIO30, TBIO35 (Fig.7) obtained excellent binding score of 11.084 with PDB code 4 K41, the values are depicted in Table 7. The molecule TSIO8 (Fig. 8) has highest binding score its interaction with protein 4 K41, and formed hydrogen bond interaction of 4″’ –NH2 group with CYS-217, the hetero atom 1 oxygen and 3″ aromatic hydroxyl group (−OH) formed hydrogen bonding interactions with same amino acid LYS-213. The 3″ hydroxyl group also farmed hydrogen bond interaction with ASP-157 respectively. The sulphur of thiazolidinone ring 1′-S formed two hydrogen bond interactions with different amino acids LEU-16 & GLY-15 and 4′-C ring carbonyl formed interaction with GLY-302 respectively. The molecule also formed pi-pi interactions of electrons of aromatic ring with TYR-306, and pi-sigma interaction with GLU-214 and 2′ C=O of thiazolidine ring formed unfavourable interaction with calcium 401 (Ca) following 2′ C=O formed carbon-hydrogen interaction with GLY-13, All interactions with amino acids and metal are depicted by 3D (Fig.9) and 2D (Fig.10) figures.

Fig. 7
figure 7

Structures of Designed Molecules

Table 7 Docking results of Designed Molecules on Over Expressed Proteins
Fig. 8
figure 8

Structure of the Designed Molecule (TSIO8) obtained Highest Binding Score with PDB

Fig. 9
figure 9

3D Binding of Molecule TSIO8 with 4 K41

Fig. 10
figure 10

2D Binding of Molecule TSIO8 with 4 K41

In conclusion, the present investigation was designed to identify DEGs that may be involved in the progression of T1D. A total of 284 DEGs and 10 hub genes were identified and might be regarded as diagnostic biomarkers and new therapeutic target for T1D. Together, EGFR, GRIN2B, GJA1, CAP2, MIF, POLR2A, PRKACA, GABARAP, TLN1 and PXN might be effective targets in T1D, but more experimental investigations and clinical trials are needed.