Background

Familial hypercholesterolemia (FH) is one of the most common inherited metabolic disorders and has limited therapeutic options. It is characterized by an abnormally high level of low-density lipoprotein cholesterol (LDL-C) in the blood, which is firmly associated with premature onset of atherosclerosis and a high risk of cardiovascular disease (CVD) [1, 2]. Historically, the prevalence of heterozygous FH (heFH) was about 1 in 500 persons [3], although it is reported that this number may be as high as 1 in 100 in some European and several South African populations [4]. There are about 3.8 million potential FH patients in China, yet clinical and genetic data on FH remain limited [5, 6]. Heterozygous FH is reported to be the most prevalent genetic defect causing marked premature mortality. In their early work, Goldstein and Brown first recognized the genetic basis of the disorder: impaired function of the low-density lipoprotein (LDL) receptor [7]. Subsequent studies of LDL receptor function revealed additional mechanisms in the pathogenesis of FH, namely defects in apolipoprotein B (apoB) that impair binding to the LDL receptor and gain-of-function mutations in proprotein convertase subtilisin/kexin type 9 (PCSK9) that enhance LDL receptor degradation [8]. In addition, many different types of LDLR mutation have been identified in patients with FH worldwide; for instance, large mutations and rearrangements in the promoter region affect gene transcription [9]. Nevertheless, the molecular mechanism of atherosclerosis in patients with FH is not completely understood, and FH remains a proven, important risk factor for atherosclerosis and even coronary heart disease (CHD). First-line treatment for patients with heFH is statin therapy, which can reduce the risk of CHD in heFH by up to about 80% when started as preventive treatment at an early age [10].
However, the long-term safety of statin therapy started at a young age remains unknown in the pediatric population, in whom the low-density lipoprotein receptor is non-functional [1]. Linda Omer et al. suggested that CRISPR/Cas9-mediated gene editing is likely to be a cutting-edge technology for correcting disease-causing gene mutations, thereby ameliorating symptoms in patients at risk for CVD [11]. Nevertheless, a substantial residual cardiovascular and inflammatory risk of developing CVD persists after treatment, especially in patients with FH. These realities have driven the search for new therapies against FH, including novel pharmaceutical drugs and genetic engineering technologies.

In the past few decades, gene chip technology and bioinformatic analysis have been widely applied to screen for genetic alterations at the genomic level [12,13,14]. Bioinformatics mainly focuses on genomics and proteomics: it analyzes the structural and functional information contained in nucleic acid and protein sequences and seeks out genes and proteins related to disease [15, 16]. A growing number of researchers now use bioinformatics to uncover the molecular mechanisms of diseases and to guide targeted treatment. In this study, the microarray datasets GSE13985 and GSE6054 were obtained from the Gene Expression Omnibus (GEO) and analyzed to identify differentially expressed genes (DEGs) between FH patients and controls. The sample data were re-analyzed using several bioinformatic methods, including DEG screening, functional enrichment analysis and protein-protein interaction (PPI) network analysis. We aimed to identify potential markers in FH patients and to explore specific targets that could prevent the progression of atherosclerosis.

Methods

Collection of raw data

The Gene Expression Omnibus (GEO) is a public, high-throughput genomic database that hosts microarray, chip and gene expression data from various species [17]. The two expression profile data sets used in our study, GSE13985 and GSE6054, were obtained from the GEO database. Both RNA expression profiles were assayed on the GPL570 platform, the [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array. The GSE13985 data set includes five blood samples from patients diagnosed with familial hypercholesterolemia and five from controls matched for age, sex, BMI and smoking status. The GSE6054 data set contains 10 monocyte samples from FH patients and 13 from control participants. We converted all probe identifiers to gene symbols on the basis of the annotation information for the platform. As these data were acquired from a public database, no further approval from the local ethics committee was required.

Data preprocessing

To make the chip data easier to analyze, the raw data were preprocessed using the affy package in R. Series matrix files were extracted to assess mRNA expression, and the expression data were preprocessed by quantile normalization or log2 transformation. Probes were then annotated with gene symbols using the annotation profile from the platform, and unmatched probes were discarded. When multiple probes mapped to one gene symbol, their average value was taken as the final expression value of that gene [18].
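
The preprocessing steps described above can be illustrated with a minimal Python sketch. It is not the affy/R code used in the study: the function names, the log2(x + 1) offset, the tie handling and the probe identifiers in the usage note are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def log2_transform(matrix):
    """Apply log2(x + 1) to every intensity (the +1 offset is an assumption)."""
    return [[math.log2(value + 1.0) for value in row] for row in matrix]


def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a probes x samples matrix.

    Ties are broken by row order here; dedicated packages treat ties
    more carefully, so this is only an illustration of the idea.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Sort each sample (column) independently.
    sorted_cols = [sorted(matrix[r][c] for r in range(n_rows)) for c in range(n_cols)]
    # Reference distribution: mean across samples at each rank.
    rank_means = [sum(col[i] for col in sorted_cols) / n_cols for i in range(n_rows)]
    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        order = sorted(range(n_rows), key=lambda r: matrix[r][c])
        for rank, r in enumerate(order):
            normalized[r][c] = rank_means[rank]
    return normalized


def collapse_probes(probe_ids, matrix, probe_to_symbol):
    """Average probes that map to the same gene symbol; drop unmatched probes."""
    groups = defaultdict(list)
    for probe_id, row in zip(probe_ids, matrix):
        symbol = probe_to_symbol.get(probe_id)
        if symbol:  # probes without an annotation are discarded
            groups[symbol].append(row)
    return {
        symbol: [sum(column) / len(rows) for column in zip(*rows)]
        for symbol, rows in groups.items()
    }
```

For example, with hypothetical probes `probe_a` and `probe_b` both annotated to one gene and `probe_c` unannotated, `collapse_probes` returns a single averaged row for that gene and omits `probe_c`.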

Screening genes of differential expression

DEGs between FH patients and matched controls were screened separately in each of the two expression profile data sets using the Linear Models for Microarray Data (limma; version 3.30.3) package in R [19]. The thresholds were P value < 0.05 and |log2FC| ≥ 0.5 for GSE13985, and P value < 0.05 and |log2FC| ≥ 1 for GSE6054. We then selected the up-regulated and down-regulated DEGs common to both data sets.
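
As a rough illustration of the screening logic (not the limma workflow itself), the threshold filter and the intersection of the two data sets might look like the following Python sketch. The input format, a mapping from gene symbol to a (log2FC, P value) pair, and the function names are assumptions.

```python
def screen_degs(results, p_cutoff=0.05, lfc_cutoff=1.0):
    """Split a result table into up- and down-regulated gene sets.

    `results` maps gene symbol -> (log2 fold change, P value); this layout
    is assumed for illustration and is not limma's output format.
    """
    up, down = set(), set()
    for gene, (log_fc, p_value) in results.items():
        if p_value < p_cutoff and abs(log_fc) >= lfc_cutoff:
            (up if log_fc > 0 else down).add(gene)
    return up, down


def common_degs(results_a, results_b, lfc_a, lfc_b, p_cutoff=0.05):
    """Return genes up-regulated in both sets and genes down-regulated in both."""
    up_a, down_a = screen_degs(results_a, p_cutoff, lfc_a)
    up_b, down_b = screen_degs(results_b, p_cutoff, lfc_b)
    return up_a & up_b, down_a & down_b
```

Calling `common_degs(results_gse13985, results_gse6054, lfc_a=0.5, lfc_b=1.0)` on two such tables (the variable names are hypothetical) would apply the per-data-set cutoffs stated above and keep only genes that change in the same direction in both.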

Functional and pathway enrichment analysis of DEGs

To investigate the biological functions and pathways of the identified DEGs, we performed Gene Ontology (GO) term and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses using the online tool the Database for Annotation, Visualization and Integrated Discovery (DAVID; https://david.ncifcrf.gov/home.jsp, version 6.8) [20]. DAVID provides a comprehensive set of functional annotation tools that help investigators understand the biological meaning behind large gene lists. The categories biological process (BP), cellular component (CC), molecular function (MF) and KEGG pathways were selected for further analysis.
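
The statistical idea behind this kind of over-representation analysis can be sketched with a plain hypergeometric test. This is an illustration only: DAVID itself reports a modified Fisher exact P value (the EASE score), which is more conservative, and the function name and arguments below are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_list, n_overlap):
    """Upper-tail hypergeometric P value for term over-representation.

    n_background: genes in the background set
    n_term:       background genes annotated with the term
    n_list:       genes in the submitted list (e.g. the DEGs)
    n_overlap:    listed genes annotated with the term
    Returns P(X >= n_overlap) under random sampling without replacement.
    """
    max_overlap = min(n_term, n_list)
    total = comb(n_background, n_list)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_list - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total
```

A small P value indicates that the term is annotated to more of the listed genes than random sampling from the background would be expected to produce.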

Integration of the PPI network and hub gene analysis

PPI networks among the DEGs were constructed using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING; https://string-db.org/) online database, with a medium confidence threshold of ≥ 0.3. The PPI network was drawn with Cytoscape software. Molecular Complex Detection (MCODE; version 1.5.1), a Cytoscape plug-in that uses topological principles to mine tightly connected regions of a PPI network, was then used to identify the most important modules, and the score of each module was calculated with the MCODE algorithm [21]. The criteria for MCODE analysis were as follows: node score cutoff = 0.2, degree cutoff = 2, max depth = 100, MCODE score > 5, and k-core = 2.
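
The score threshold and a simple degree count of the resulting network can be sketched in Python as follows. This does not reproduce the MCODE clustering algorithm or any Cytoscape plug-in; the edge format (two gene symbols and a score between 0 and 1) and the function names are assumptions made for illustration.

```python
from collections import Counter


def build_network(scored_edges, min_score=0.3):
    """Keep undirected edges whose interaction score meets the threshold.

    `scored_edges` is an iterable of (gene_a, gene_b, score) tuples; duplicate
    and reversed pairs collapse into one edge, and self-loops are ignored.
    """
    edges = set()
    for gene_a, gene_b, score in scored_edges:
        if gene_a != gene_b and score >= min_score:
            edges.add(frozenset((gene_a, gene_b)))
    return edges


def node_degrees(edges):
    """Count how many edges touch each node."""
    degrees = Counter()
    for edge in edges:
        for node in edge:
            degrees[node] += 1
    return degrees


def hub_genes(edges, min_degree):
    """Return nodes with degree >= min_degree, highest degree first."""
    degrees = node_degrees(edges)
    hubs = [gene for gene, degree in degrees.items() if degree >= min_degree]
    return sorted(hubs, key=lambda gene: (-degrees[gene], gene))
```

Ranking nodes by degree in this way is the simplest of several centrality measures; module detection with MCODE additionally weights nodes by local neighborhood density.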

Identification of TF targets

Transcription factor (TF) networks were constructed from the differentially expressed genes with reference to validated data collected from several databases [22]. TF targets were extracted from the TRANSFAC database, and the regulatory interactions between TFs and genes were obtained with a Python script. Based on DAVID, regulatory relationships between TFs and target DEGs were predicted using Enrichr, and the TF-target regulatory networks were visualized in Cytoscape.
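
The authors' Python script is not given, so the following is only a guess at the kind of step involved: restricting a TF-to-target mapping to the DEGs and listing the resulting edges for visualization. The input format and function names are hypothetical, and no TRANSFAC or Enrichr interface is used.

```python
def tf_target_network(tf_targets, degs):
    """Build TF -> gene edges, keeping only targets that are DEGs.

    `tf_targets` maps a TF name to an iterable of its target gene symbols
    (assumed format); `degs` is the set of DEG symbols.
    """
    edges = []
    for tf, targets in sorted(tf_targets.items()):
        for gene in sorted(set(targets) & set(degs)):
            edges.append((tf, gene))
    return edges


def network_nodes(edges):
    """Return the sets of TFs and target genes that appear in the edge list."""
    tfs = {tf for tf, _ in edges}
    targets = {gene for _, gene in edges}
    return tfs, targets
```

The edge list can be written out as a two-column table, which is a common way to import a network into a visualization tool; TFs with no DEG targets do not appear in it.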

Results

Identification of DEGs

A total of 1445 DEGs were identified in dataset GSE13985 when comparing the FH group with the control group (Fig. 1a). Of these, 452 were up-regulated and 993 were down-regulated (adjusted P value < 0.05 and |log2FC| ≥ 0.25). Similarly, we identified 2056 DEGs in GSE6054, comprising 1344 up-regulated and 712 down-regulated DEGs (adjusted P value < 0.05 and |log2FC| ≥ 1) (Fig. 1b). We then screened for DEGs that were up-regulated or down-regulated in both GSE13985 and GSE6054; the Venn diagrams showed that 49 DEGs were up-regulated and 53 DEGs were down-regulated in both data sets (Fig. 1e, f).

Fig. 1

a Clustered heat map of DEGs between FH and control samples in GSE13985. b Clustered heat map of DEGs between FH and control samples in GSE6054. The abscissa represents the samples, and the ordinate represents the genes. Red boxes indicate up-regulated genes, and green boxes indicate down-regulated genes. c Volcano plot of the DEGs between FH and control samples in GSE13985. d Volcano plot of the DEGs between FH and control samples in GSE6054. e Venn diagram showing the 49 up-regulated genes shared by the two data sets. f Venn diagram showing the 53 down-regulated genes shared by the two data sets. DEGs, differentially expressed genes

Functional annotation of DEGs through GO and KEGG analysis

To uncover the biological classification of the DEGs, GO functional and KEGG pathway analyses were performed with DAVID. The most enriched biological process (BP) annotations included blood coagulation, cell-cell adhesion, ER to Golgi vesicle-mediated transport, integrin-mediated signaling pathway and neural tube closure. In the cellular component (CC) category, the DEGs were mainly enriched in cytoplasm, plasma membrane, extracellular exosome, nucleoplasm and membrane. In the molecular function (MF) category, the DEGs were significantly enriched in protein binding, poly(A) RNA binding, cadherin binding involved in cell-cell adhesion and transmembrane signaling receptor activity (Fig. 2a). KEGG pathway analysis showed that the DEGs were enriched in the focal adhesion and glucagon signaling pathways (Fig. 2b). More detailed results of the GO and KEGG analyses are provided in Table 1.

Fig. 2

GO terms and KEGG pathway enrichment. a GO enrichment analysis of the DEGs, showing the top six terms. b KEGG pathway analysis of the DEGs. GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes

Table 1 The functional enrichment analyses of DEGs ranked by P-value

PPI network and hub genes analysis

The PPI network among the DEGs was constructed, and its most significant modules identified, using the online tool STRING with a cutoff score of ≥ 0.3, and the network was then adjusted in Cytoscape. In total, the PPI network comprised 48 nodes and 65 edges (Fig. 3a). Using the cytoHubba plug-in for Cytoscape, 10 genes (ITGAL, TLN1, POLR2A, CD69, GZMA, VASP, HNRNPUL1, SF1, SRRM2 and ITGAV) with degree ≥ 6 were identified as hub genes (Fig. 3b). The names, abbreviations and functions of these hub genes are shown in Table 2. For the hub genes, the most significantly enriched BPs were cell adhesion, integrin-mediated signaling pathway and cell-matrix adhesion. In the CC category, they were mainly enriched in nuclear speck, cytoskeleton, catalytic step 2 spliceosome and Cajal body, and in the MF category mainly in poly(A) RNA binding (Fig. 3c).

Fig. 3

a PPI network of significantly differentially expressed genes. Up-regulated genes are marked in light red; down-regulated genes are marked in light green. b The top 10 hub genes selected from the PPI network. c GO enrichment analysis of the hub genes. PPI, protein-protein interaction

Table 2 Summary of the function of 10 hub genes

Analysis of TF-target regulating networks

The TF-target regulatory network contained 219 nodes, comprising 214 DEGs and 5 transcription factors (TFs): SP1, EGR3, CREB, SEF1 and HOX13 (Fig. 4). These TFs appear to play the main regulatory role in the network. Most of the five are zinc finger transcription factors that bind to several kinds of motifs in many promoters. We speculate that the predicted TFs may affect early-onset atherosclerosis in familial hypercholesterolemia by activating or inhibiting transcription of the related DEGs.

Fig. 4

The TF-target regulatory network, comprising 214 DEGs and 5 transcription factors. Up-regulated genes are marked in light red, down-regulated genes in light green and transcription factors in blue

Discussion

FH is the most common genetic cause of cardiovascular disease and leads to premature atherosclerotic cardiovascular disease through lifelong exposure to elevated low-density lipoprotein cholesterol (LDL-C) levels [23]. This genetic disorder disrupts LDL-C metabolism, reducing hepatic clearance of cholesterol-loaded LDL particles from the blood [24]. The LDL receptor (LDLR) binds LDL particles at the plasma membrane, internalizes them and releases them in a low-pH environment for lysosomal degradation; the cholesterol released then suppresses the microsomal enzyme 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase, which catalyzes the rate-limiting step in cholesterol synthesis [25]. Numerous experimental and epidemiological studies have established the causal role of LDL in the development of atherosclerosis and in the incidence of atherothrombotic complications such as coronary heart disease (CHD) [26]. Although awareness of FH is increasing, this potentially fatal but treatable condition remains underdiagnosed and undertreated. Some traditional CVD risk factors are common in FH patients and have been independently associated with CVD risk in the FH population. Genetic factors such as single nucleotide polymorphisms (SNPs) and alterations such as changes in telomere length in somatic cells have been reported to predict the FH phenotype and CVD prognosis. In addition, certain circulating molecules, which play different roles in regulating atherosclerosis, have been described as surrogate markers of CVD risk in FH populations. Understanding the changes in gene expression in FH is therefore critical for understanding the mechanism of disease progression and for identifying diagnostic or therapeutic targets.

Bioinformatic technology has been widely applied to search for genes and molecules connected with the occurrence and development of disease and is regarded as a promising approach for finding targeted treatments. It allows disease-related data from large public databases to be analyzed to identify the genes most relevant to a disease. In the present study, 102 common DEGs were identified in FH samples compared with healthy samples, comprising 49 up-regulated and 53 down-regulated genes. From the PPI network, 10 hub DEGs (ITGAL, TLN1, POLR2A, CD69, GZMA, VASP, HNRNPUL1, SF1, SRRM2 and ITGAV) were selected; ITGAL, TLN1, POLR2A, VASP, HNRNPUL1, SF1 and SRRM2 were expressed at higher levels in FH patients, whereas CD69, GZMA and ITGAV were expressed at lower levels. KEGG pathway analysis revealed that the DEGs were significantly enriched in the focal adhesion and glucagon signaling pathways. Cell adhesion genes (ITGAL, TLN1), poly(A) RNA binding-related genes (HNRNPUL1, SRRM2, SF1) and protein homotetramerization genes were enriched in these pathways. We hypothesize that these significant DEGs and their functions contribute to the development of atherosclerosis in FH patients.

The main risk for FH patients is the early onset of atherosclerosis and cardiovascular disease. Atherosclerosis is characterized by hyperplasia of the blood vessel wall, lipid accumulation in the vessel wall, invasion of the vessel wall by cytokine-activated macrophages and the formation of macrophage foam cells [27]. ITGAL belongs to the integrin α chain family and encodes the integrin αL chain; it plays a role in T cell activation, mainly through the contact of T cell receptors with antigens bound to MHC molecules on antigen-presenting cells [28]. No studies have reported that ITGAL is directly related to the development of FH or atherosclerosis. However, a previous study in THP-1 cells reported a TNF-α-dependent increase in Ac-LDL uptake and in the expression of OLR1, NOX2, NCF1, ITGA4 and ITGAL, suggesting that ITGAL regulates Ac-LDL uptake and affects foam cell formation in macrophages [29]. The GO annotations of this gene include protein heterodimerization activity and cell adhesion molecule binding, which may be related to abnormal immune cell adhesion or to the accumulation of metabolites in blood vessels. In addition, work in ItgaL−/− null NOD/LtJ mice has shown that genetic deficiency of ItgaL can prevent the occurrence of hyperglycemia. Animal experiments have shown that lack of ItgaL can prevent insulin resistance, while lack of Itgb2 can also provide protection. Transferring splenocytes lacking ItgaL to NOD/Rag-1 mice does not lead to the development of diabetes, which suggests that ItgaL has a role in NOD/LtJ T cell activation [30]. Using an in vitro parallel-plate flow chamber model, John H. Chidlow Jr. et al. found that deletion of ItgaM completely prevented neutrophils from adhering to VEGF-A-stimulated endothelium, whereas lack of ItgaL only weakened neutrophil adhesion. Lack of ItgaM significantly reduced neutrophil rolling, but lack of ItgaL did not.
They also found that genetic deficiency of ItgaL or ItgaM significantly impaired T cell adhesion to VEGF-A-stimulated colonic endothelium [31]. This suggests that the ITGAL gene we identified may be related to abnormal deposition at the atherosclerotic endothelium.

TLN1 is reported to be associated with important biological processes, including platelet degranulation, muscle contraction, cytoskeletal anchoring at the plasma membrane, cell-cell junction assembly and cell-substrate junction assembly [32]. The gene is probably involved in connecting major cytoskeletal structures to the plasma membrane. It encodes a high-molecular-weight cytoskeletal protein concentrated at regions of cell-substratum contact and, in lymphocytes, at cell-cell contacts. Diseases associated with TLN1 include leukocyte adhesion deficiency type I and leukocyte adhesion deficiency type III. The ubiquitously expressed cytoskeletal protein talin (Tln) is a component of muscle costameres that ultimately links integrins to the sarcomere. There are two talin genes, Tln1 and Tln2, of which Tln2 is the dominant isoform [33]. One study tested the function of both Tln forms in the myocardium, in postnatal cardiomyocytes (CMs). Recent studies in non-muscle cells have also found that Tln is a key regulator of force transmission and transduction, a particularly important feature in heart muscle: the myocardium is continuously subjected to mechanical force under basal conditions and must adapt to mechanical changes under physiological stress or pathological conditions [34, 35]. Interestingly, global deletion of Tln2 in mice caused no structural or functional changes in the heart, perhaps because CM Tln1 was up-regulated [32]. The results showed that CM Tln2 was essential for proper β1D-integrin expression and that Tln1 could presumably substitute for Tln2 in preserving heart function, whereas loss of both Tln forms from cardiomyocytes resulted in myocyte instability and dilated cardiomyopathy.

In addition, our analysis identified several TFs (SP1, EGR3, CREB, SEF1 and HOX13) associated with FH, which suggests that these genes play important roles in FH. Based on the current literature, we discuss below the association between FH and these transcription factors. The protein encoded by SP1 is a zinc finger transcription factor that binds with high affinity to GC-rich motifs in many promoters. SP1 can activate or repress transcription in response to physiological and pathological stimuli, and it regulates the expression of numerous genes involved in processes such as cell growth, apoptosis, differentiation and immune responses [36]. We speculate that the expression level of SP1 may regulate the calcification of collagen in atherosclerotic plaques. Interestingly, unstable (noncalcified) plaques have been shown to undergo thinning of the fibrous cap prior to rupture, possibly because macrophages release proteolytic matrix-degrading enzymes that degrade the cap. SP1 is reported to be highly regulated by post-translational modifications (phosphorylation, sumoylation, proteolytic cleavage, glycosylation and acetylation) and to bind the PDGFR-alpha G-box promoter [37]. This transcription factor may also have a role in modulating the cellular response to DNA damage. A recent report found that SCARB1 was downregulated by DNMT3b through decreased recruitment of SP1 to the SCARB1 promoter. In our view, this finding offers novel insight into a possible mechanism of atherosclerosis in FH [38]. Another transcription factor, Early Growth Response 3 (EGR3), belongs to the EGR family of C2H2-type zinc finger proteins.
EGR3 is reported to be an immediate-early growth response gene that is induced by mitogenic stimulation and functions in a wide variety of processes, including muscle development, lymphocyte development, endothelial cell growth and migration, and neuronal development. Diseases previously associated with EGR3 include bipolar I disorder and chondromalacia of the patella. Its related pathways include circadian rhythm-related genes and calcineurin-regulated NFAT-dependent transcription in lymphocytes. Jun-ichi Suehiro et al. showed that, in HUVECs, Egr-3 exhibited more pronounced, delayed and sustained induction than Egr-1. Furthermore, deletion of Egr-3 markedly impaired the proliferation, migration and tube formation of endothelial cells and hindered VEGF-mediated monocyte adhesion. These findings suggest that Egr-3 plays a critical role in VEGF signaling in activated endothelial cells, so EGR3 is likely to be a potential therapeutic target for preventing vasculopathic diseases. The CREB gene encodes a transcription factor that is a member of the leucine zipper family of DNA-binding proteins. The protein binds as a homodimer to the cAMP-responsive element, an octameric palindrome. It is phosphorylated by several protein kinases and induces transcription of genes in response to hormonal stimulation of the cAMP pathway. This phosphorylation-dependent transcription factor is involved in different cellular processes, including the synchronization of circadian rhythmicity and the differentiation of adipose cells. The suppressor of essential function 1 (SEF1) is a fungal zinc finger transcription factor that is described as regulating a genetic middle homology region.
Some studies have reported that Sef1 responds to deficient Fe-S cluster synthesis through regulated changes in its subcellular location: it is retained in the nucleus, resulting in induced expression of the iron regulon [39]. The homeobox transcription factor HOX13 is a member of the Hox family of homeobox genes, which encode DNA-binding proteins. Previous studies have shown that the structure, genomic organization, expression patterns and biological functions of the Hox family are highly conserved [40]. In vertebrates, axial Hox expression has been observed in the neural tube and some paraxial mesoderm derivatives, while in arthropods, Hox gene expression has been found in the ventral nerve cord, visceral mesoderm and epidermis [41]. Although no reports have described a regulatory relationship between HOX13 and FH or atherosclerosis, we believe that the underlying mechanism of HOX13 warrants further study.

Although a rigorous bioinformatic analysis was carried out in the present study, some limitations remain. The amount of data in this study is limited, so the results may contain some bias; enlarging the sample size would improve the accuracy of the findings. In addition, although the analysis suggests that the hub molecules and TFs are closely associated with the development of FH and may serve as potential markers or therapeutic targets, the specific mechanisms still need to be investigated in animal or cell experiments.

Conclusion

In conclusion, the findings of the current study suggest that the development of atherosclerosis might result from an imbalance between macrophages and fibrosis. Specifically, up-regulated ITGAL, TLN1, POLR2A, VASP, HNRNPUL1, SF1 and SRRM2 and down-regulated CD69, GZMA and ITGAV may play important roles in promoting the formation of atherosclerotic plaques in those with FH. Moreover, SP1, EGR3, CREB, SEF1 and HOX13 were the potential transcription factors for the DEGs and could serve as targets for preventing the rupture of atherosclerotic (AS) plaques. These findings provide a theoretical basis for understanding the potential etiology of the occurrence and development of AS in FH patients and may help to identify potential diagnostic and therapeutic targets.