Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is an enveloped, single-stranded RNA virus that belongs to the coronavirus family and to the species severe acute respiratory syndrome-related coronavirus (Gorbalenya et al. 2020). It is the seventh coronavirus known to infect humans and can be transmitted from person to person (Anderson et al. 2020). According to data from Johns Hopkins University in the United States, as of 6:35 a.m. Beijing time on July 23, 2020, the cumulative numbers of confirmed cases and deaths of coronavirus disease 2019 (COVID-19) worldwide had exceeded 15 million and 620,000, respectively, and were still rising rapidly.

Healthy people can become infected by inhaling the respiratory secretions or droplets of infected people, and SARS-CoV-2 also spreads through close contact (Anderson et al. 2020; Shereen et al. 2020; Xu et al. 2020). Fever and cough are frequently seen in patients, and some patients also develop myalgia and fatigue (Xu et al. 2020; Huang et al. 2020). Although patients mainly show respiratory symptoms, the virus can damage many organs at the same time, including the heart, kidney, intestine, brain and liver (Dhakal et al. 2020). Cardiac injury has been reported in 23% of severe patients. It is a common complication that worsens the severity and progression of the disease, and it includes palpitations or chest pain, myocarditis, acute myocardial infarction (AMI), heart failure (HF), arrhythmia, and venous thromboembolic events (VTE) (Yang et al. 2020; Chen et al. 2020; Long et al. 2020; Rizzo et al. 2020). Among these, AMI and arrhythmia occur in patients with COVID-19 and are more common in ICU patients (Bansal 2020). The receptor-binding domain (RBD) of SARS-CoV-2 may bind tightly to human angiotensin-converting enzyme 2 (ACE2), which is strongly expressed in pericytes of the human heart, indicating the potential susceptibility of cardiomyocytes to SARS-CoV-2 (Chen et al. 2020). Additionally, ACE2 expression is markedly increased in patients with heart failure (Chen et al. 2020). Myocardial injury may arise through a direct and an indirect mechanism. In the direct mechanism, the virus infiltrates myocardial tissue, resulting in the death and inflammation of cardiac muscle. In the indirect mechanism, cardiomyocyte stress secondary to pulmonary failure, hypoxemia, and cardiac inflammation can lead to severe systemic inflammatory response syndrome (SIRS) (Akhmerov and Marbán 2020).

Preliminary clinical studies have identified several drugs of therapeutic potential, including antiviral drugs, nucleoside analogues, neuraminidase inhibitors, therapeutic peptides, RNA synthesis inhibitors, anti-inflammatory drugs, chloroquine phosphate, Lianhuaqingwen capsules and ShuFengJieDu capsules (Chakraborty et al. 2020; Lu 2020; Gao et al. 2020).

High-throughput sequencing provides information on genes, pathophysiological mechanisms, diagnosis and therapy, and supports antiviral prevention and vaccine development (Chan et al. 2020). It shortens the experimental cycle and reduces the cost of experiments, and it is particularly valuable for species such as SARS-CoV-2, for which biological information is scarce and the databases still need to be improved (Pareek et al. 2011).

The exact mechanism of myocardial damage caused by COVID-19 remains unclear. In this study, gene expression data from human induced pluripotent stem cell (iPSC)-derived cardiomyocytes infected by SARS-CoV-2 were analyzed by bioinformatics methods, in the hope of clarifying how SARS-CoV-2 infects myocardial cells. Such an analysis is of great significance for the establishment of a cardiac antiviral drug screening platform (Bojkova et al. 2020) and may help to promote the development of targeted clinical drug therapy. The aim of our study was to identify hub genes and signaling pathways in SARS-CoV-2-infected cardiomyocytes, which may help to screen appropriate and effective biomarkers for further research.

Methods

Data Sources

The gene expression profile of the GSE150392 dataset was downloaded from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) database of the National Center for Biotechnology Information (NCBI). GEO archives raw sequence data, gene expression datasets and platform records. The dataset is based on the GPL18573 platform (Illumina NextSeq 500, Homo sapiens) and contains six human iPSC-cardiomyocyte samples: GSM4548303-GSM4548305 served as the infection group and GSM4548306-GSM4548308 as the control group.

Data Processing

R software was used to calibrate and standardize the high-throughput sequencing data of the GSE150392 dataset downloaded from the GEO database. The limma package was used to screen differentially expressed genes (DEGs) with the thresholds |log2FC| ≥ 2 and P < 0.05, the pheatmap package to draw the heat map of DEGs, and the ggplot2 package (Ito and Murphy 2013) to draw the volcano map.
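The screening rule can be illustrated with a minimal Python sketch. It is not the limma workflow used in this study: it only shows how the two thresholds classify a gene as up-regulated, down-regulated or not significant. The gene names, fold changes and P-values below are invented for illustration and are not taken from GSE150392.

```python
import math

# Hypothetical per-gene results: (gene, fold change infected/control, P-value).
# All values are invented for illustration; none come from GSE150392.
results = [
    ("GENE_A", 8.0, 0.001),
    ("GENE_B", 0.2, 0.010),
    ("GENE_C", 4.0, 0.200),
    ("GENE_D", 1.5, 0.001),
]

LOG2FC_CUTOFF = 2.0  # |log2FC| >= 2, as stated in the text
P_CUTOFF = 0.05      # P < 0.05, as stated in the text


def classify(fold_change, p_value):
    """Label one gene using the two thresholds from the text."""
    log2fc = math.log2(fold_change)
    if p_value < P_CUTOFF and log2fc >= LOG2FC_CUTOFF:
        return "up"
    if p_value < P_CUTOFF and log2fc <= -LOG2FC_CUTOFF:
        return "down"
    return "not significant"


labels = {gene: classify(fc, p) for gene, fc, p in results}
for gene, label in labels.items():
    print(gene, label)
```

A gene must pass both thresholds to count as a DEG: in this toy input, GENE_C has a large fold change but fails the P-value cutoff, and GENE_D is significant but changes too little.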

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Analysis of DEGs

Functional and pathway enrichment analyses were used to explain the biological functions of DEGs. GO analysis, which comprises the three categories cellular component (CC), biological process (BP) and molecular function (MF), was used to describe the characteristics of genes and gene products (Fan and Wei 2020). KEGG, a public integrated database of biological systems designed for the systematic analysis of gene functions, makes it possible to identify the pathways that change significantly under experimental conditions. The clusterProfiler package in R was used to conduct the GO and KEGG analyses of DEGs, and results with P < 0.05 were regarded as statistically significant.
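Over-representation analysis of this kind is commonly based on a hypergeometric test. The following Python sketch shows the idea under that assumption; it does not reproduce clusterProfiler, and the settings used in this study beyond the P < 0.05 cutoff are not stated in the text. The function name and the numbers in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, term_size, deg_count, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background_size: number of genes in the background
    term_size:       background genes annotated to the GO term or pathway
    deg_count:       number of DEGs drawn from the background
    overlap:         DEGs that are annotated to the term
    """
    total = comb(background_size, deg_count)
    upper = min(term_size, deg_count)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, deg_count - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 20 background genes, 5 annotated to a term,
# 4 DEGs of which 3 carry the annotation.
p = enrichment_p_value(background_size=20, term_size=5, deg_count=4, overlap=3)
print(round(p, 4))
```

A small P-value means the overlap between the DEG list and the term is larger than expected from random sampling of the background.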

Construction of Protein–Protein Interaction (PPI) Network and Identification and Validation of Hub Genes

The DEGs were uploaded to the STRING website (Szklarczyk et al. 2017) (http://string-db.org/) to build a PPI network describing the relationships between the proteins encoded by the DEGs. Visualization and hub-gene screening were performed in Cytoscape (Su et al. 2014) with the cytoHubba plug-in, which calculates a score for each gene. The 12 genes with the highest Maximal Clique Centrality (MCC) scores were identified as hub genes. To confirm the findings of the bioinformatics analysis, the hub genes were verified with single-cell RNA sequencing data of peripheral blood mononuclear cells (PBMCs) from the GSE150728 dataset and high-throughput sequencing data of human embryonic stem cell-derived cardiomyocytes (hESC-CMs) from the GSE151879 dataset. GraphPad Prism was used to analyze the data from the GSE150728 and GSE151879 datasets, and P < 0.05 was considered a significant difference.
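As a rough illustration of the MCC ranking, the sketch below assumes the usual definition of the score: for each node, the sum of (|C| - 1)! over all maximal cliques C that contain it. This is a plain Python re-implementation for a toy graph, not the cytoHubba code, and the node names are hypothetical.

```python
from math import factorial


def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        for node in list(candidates):
            neighbors = adjacency[node]
            expand(current | {node}, candidates & neighbors, excluded & neighbors)
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques


def mcc_scores(edges):
    """Assumed MCC definition: sum of (clique size - 1)! per maximal clique."""
    adjacency = {}
    for a, b in edges:  # undirected edges, no self-loops assumed
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    scores = {node: 0 for node in adjacency}
    for clique in maximal_cliques(adjacency):
        for node in clique:
            scores[node] += factorial(len(clique) - 1)
    return scores


# Toy network: a triangle A-B-C plus a pendant edge C-D.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
scores = mcc_scores(toy_edges)
ranked = sorted(scores, key=lambda n: (-scores[n], n))
print(ranked)
```

In the toy network, node C belongs to both the triangle and the pendant edge, so it receives the highest score. Taking the first 12 entries of such a ranking corresponds to the hub-gene selection described above.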

Construction of Gene and miRNA Interaction Network

MicroRNAs (miRNAs) are short, non-coding, single-stranded RNAs that contribute to the pathogenesis of diseases by regulating gene expression and thereby controlling a variety of biological functions (Chou et al. 2018; Mitash et al. 2020; Saçar Demirci and Adan 2020). The gene list was uploaded to the NetworkAnalyst website, and network analysis was performed with the miRTarBase v8.0 database (Chou et al. 2018), which collects experimentally verified miRNA-target interactions (MTIs).

Results

Identification of DEGs

A total of 707 DEGs were identified according to the criteria |log2FC| ≥ 2 and P < 0.05, including 516 up-regulated genes and 191 down-regulated genes. The results are presented as a volcano map and a heat map (Fig. 1A, B), drawn with the ggplot2 and pheatmap packages in R.

Fig. 1

Volcano map and heat map of differentially expressed genes (DEGs). A Red dots indicate up-regulated genes and blue dots indicate down-regulated genes. Black dots indicate the remaining genes, which show no significant change in expression. The threshold was set as follows: P < 0.05 and |log2FC| ≥ 2. FC: fold change. B Gene expression data converted into a data matrix. Each column represents a sample and each row represents a gene. The color of each cell represents the expression level, and the color key is shown in the upper right corner of the figure

GO Enrichment Analysis of DEGs

The GO enrichment analysis showed that the most significant BP terms were response to type I interferon, type I interferon signaling pathway and cellular response to type I interferon (Fig. 2A). In the CC category, the top 5 terms were sarcomere; contractile fiber part; contractile fiber; I-kappa B/NF-kappa B complex; and myofibril (Fig. 2B). In the MF category, the top terms were receptor ligand activity; receptor regulator activity; cytokine receptor-binding; DNA-binding transcription activator activity, RNA polymerase II-specific; and cytokine activity (Fig. 2C). Details of the top 5 significant terms in BP, CC and MF are given in Table 1.

Fig. 2

The GO enrichment analysis and KEGG pathway analysis. BP biological process, CC cellular component, MF molecular function, KEGG Kyoto Encyclopedia of Genes and Genomes, GO Gene Ontology

Table 1 Significant enrichment of GO terms (top 5 respectively in biological process (BP), cellular component (CC) and molecular function (MF) according to P-value)

KEGG Pathway Enrichment Analysis of DEGs

The biological functions of DEGs were identified through KEGG pathway enrichment analysis. The result of the analysis is shown in Fig. 2D. Ranked by P-value, the top 10 enriched pathways were the TNF signaling pathway, NF-kappa B signaling pathway, Legionellosis, Cytokine-cytokine receptor interaction, IL-17 signaling pathway, Measles, Influenza A, Epstein-Barr virus infection, NOD-like receptor signaling pathway and Pertussis, as shown in Table 2.

Table 2 The top 10 significantly enriched KEGG pathways (according to P-value)

Construction of PPI Network and Identification and Validation of Hub Genes

A PPI network consisting of 82 nodes and 213 edges was constructed on the STRING website. After visualization and hub-gene screening in Cytoscape, the PPI network and the top 12 genes were obtained (Fig. 3A, B). The top 12 genes were ISG15, MX1, IFIT1, IFIT2, RSAD2, OAS2, IFIT3, MX2, OAS1, OAS3, IFI6 and DDX58, all of which showed statistically significant differences when verified with the GSE151879 dataset (Fig. 4A). The relative expression levels of 10 hub genes, namely ISG15, MX1, IFIT1, RSAD2, OAS2, IFIT3, MX2, OAS1, OAS3 and IFI6, were also consistent with the single-cell RNA sequencing data of PBMCs from the GSE150728 dataset, whereas DDX58 and IFIT2 showed no statistically significant difference (Fig. 4B).

Fig. 3

The PPI network of DEGs and the top 12 hub genes of PPI network

Fig. 4

A Verification of hub genes with GSE151879 dataset. P-value < 0.05 is considered to be statistically significant. B Verification of hub genes with GSE150728 dataset. P-value < 0.05 is considered to be statistically significant

Construction of Gene and miRNA Interaction Network

After network visualization, the gene-miRNA interactions are presented in Fig. 5. Based on the degree criterion, the miRNA hsa-mir-335-5p was finally screened out.
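The degree criterion can be sketched in Python as follows, on the assumption that degree here means the number of distinct DEGs connected to a miRNA in the gene-miRNA network. The interaction pairs are hypothetical placeholders and are not records from miRTarBase.

```python
from collections import Counter

# Hypothetical gene-miRNA pairs; not taken from miRTarBase.
interactions = [
    ("GENE_A", "miR-x"), ("GENE_B", "miR-x"), ("GENE_C", "miR-x"),
    ("GENE_A", "miR-y"), ("GENE_B", "miR-y"),
    ("GENE_C", "miR-z"),
    ("GENE_A", "miR-x"),  # duplicate record, counted only once
]


def mirna_degrees(pairs):
    """Count the distinct genes connected to each miRNA."""
    unique_pairs = set(pairs)
    return Counter(mirna for _gene, mirna in unique_pairs)


degrees = mirna_degrees(interactions)
top_mirna, top_degree = degrees.most_common(1)[0]
print(top_mirna, top_degree)
```

The miRNA with the largest degree targets the most DEGs in the network, which is the sense in which a single miRNA can be screened out.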

Fig. 5

The gene-miRNA network of DEGs based on miRTarBase v8.0 database

Discussion

In this study, the gene expression profile of the GSE150392 dataset was downloaded from the GEO database, and 707 DEGs were identified, including 516 up-regulated DEGs and 191 down-regulated DEGs. In addition, the GO enrichment analysis and KEGG pathway analysis were applied to analyze the functional enrichment of DEGs. A PPI network was applied to identify the hub genes that might play a significant role in the infection of cardiomyocytes by SARS-CoV-2.

In the GO enrichment analysis of DEGs, a smaller P-value indicates more significant enrichment. For biological process (BP), DEGs are most significantly enriched in response to type I interferon, type I interferon signaling pathway and cellular response to type I interferon. Type I interferon has broad-spectrum antiviral activity against RNA viruses and may induce innate and adaptive immune responses against viruses. In addition, IFN-α promotes the expression of ACE2, which is strongly expressed in the human heart. Binding of the SARS-CoV-2 RBD to ACE2 leads to infection, and SARS-CoV-2 can exploit species-specific interferon-driven up-regulation of ACE2 to promote infection (Chen et al. 2020; Ziegler et al. 2020; Mantlo et al. 2020). For cellular component (CC), sarcomere and contractile fiber are significantly enriched. Direct damage to cardiomyocytes, cardiac inflammation, immune response, myocardial interstitial fibrosis, and hypoxia caused by SARS-CoV-2 may all act on the sarcomere and contractile fiber, causing functional changes that gradually evolve into myocardial injury (Babapoor-Farrokhran et al. 2020). For molecular function (MF), DEGs are most significantly enriched in receptor ligand activity and receptor regulator activity. ACE2 appears to be the crucial functional receptor of SARS-CoV-2 (Coutard et al. 2020; Walls et al. 2020). The interaction between the viral protein ligand and the cell receptor is a critical step in viral replication and infection, and it may be determined by the type of chemical interactions, the activity between receptor and ligand, and the regulator activity (Ortega et al. 2020).

The KEGG pathway enrichment analysis showed that several of the most significantly enriched pathways by P-value, such as the TNF signaling pathway, the NF-kappa B (NF-κB) signaling pathway, and Legionellosis, are highly relevant to SARS-CoV-2 infection. TNF is a crucial inflammatory cytokine involved in various acute and chronic inflammations (Soy et al. 2020). The TNF-α level was observed to be much higher in the serum of patients with COVID-19, implying that it is positively correlated with disease severity (Soy et al. 2020). As SARS-CoV-2 infection progresses, monocytes and macrophages increase, resulting in a cytokine storm, that is, the release of cytokines including the pro-inflammatory cytokines IL-1, IL-8, and TNF (Soy et al. 2020; Runfeng et al. 2020). The inflammatory response triggered by SARS-CoV-2 in the myocardium drives up-regulation of the TNF signaling pathway. The NF-κB pathway is involved in inflammation, innate and adaptive immune responses, and the pathological development of tumors, and it is a critical regulator of stress responses, apoptosis and differentiation (Baldwin 2001; DiDonato et al. 2012; Oeckinghaus et al. 2011). SARS-CoV-2 stimulates humoral and cellular immune responses as well as the mitogen-activated protein kinase (MAPK) and NF-κB signaling pathways, and gradually causes dysregulation of the immune system and up-regulation of inflammatory pathways, especially NF-κB (Mozafari et al. 2020; Wu and Yang 2020; Li et al. 2020). Higher expression of NF-κB induces the activation and proliferation of naïve T cells, promotes the expression of chemokines and other immune responses, and finally results in an inflammatory cytokine storm in the myocardium (Mozafari et al. 2020; Hoesel and Schmid 2013).

The 12 hub genes of cardiomyocytes infected by SARS-CoV-2 were identified, namely ISG15, MX1, IFIT1, IFIT2, RSAD2, OAS2, IFIT3, MX2, OAS1, OAS3, IFI6 and DDX58, which may serve as biomarkers. Among the 12 genes, Rahnefeld et al. (2014) found that ISG15, as part of the innate immunity of cardiomyocytes, plays a significant role in inhibiting viral replication in mouse cardiomyocytes infected by Coxsackievirus B3 (CVB3). In addition, SARS-CoV-2 can induce the expression of a large number of interferon-stimulated genes (ISGs), which act as effectors, regulators or both, as part of the antiviral response of the body (Clemente et al. 2020). This suggests that ISG15 may play a crucial role in cardiomyocytes infected by SARS-CoV-2. For another hub gene, MX1, Tammineni et al. (2018) showed that the β4 subunit of Cav1.2 channels can promote the expression of IFN-related genes (including MX1) in cardiomyocytes, thus alleviating viral infection. IFIT1, IFIT2, IFIT3 and IFIT5 constitute the human IFIT gene family, whose members are expressed at very low levels in most cell types but can be markedly induced by interferon treatment, viral infection and pathogen-associated molecular patterns (PAMPs) (Pidugu et al. 2019). SARS-CoV-2 can induce a strong antiviral response by up-regulating antiviral factors such as OAS1-3 and IFIT1-3 and the Th1 chemokines CXCL9/10/11, and it reduces ribosomal protein transcription (Lieberman et al. 2020).

Hsa-miR-335-5p regulates important transcription factors and participates in activating pathways associated with inflammation and angiogenesis. It may activate the WNT and TGF-β signaling pathways, both of which exert pleiotropic and multifunctional effects and regulate a considerable number of biological processes (Kay et al. 2019; Esquinas et al. 2017). TGF-β plays an important role in the pathogenesis of cardiac remodeling and fibrosis (Dobaczewski et al. 2011).

The gene expression profiles of normal human iPSC-cardiomyocyte samples and of human iPSC-cardiomyocyte samples infected by SARS-CoV-2 were studied through bioinformatics analysis, and the hub genes of human iPSC-cardiomyocytes after SARS-CoV-2 infection were identified. Although sequencing data from other datasets were used for verification, further experimental research (such as qRT-PCR) still needs to be conducted to verify the predictions obtained from the bioinformatics analysis. This study has some limitations. First, it did not clarify the mechanism by which several dominant genes are involved in the infection of cardiomyocytes by SARS-CoV-2. Second, the small sample size and the limited number of datasets may affect the accuracy of the conclusions. Further research is needed to demonstrate the molecular mechanism of the biological events involved in the infection of cardiomyocytes by the novel coronavirus.

Conclusion

In conclusion, the RNA-Seq data of human iPSC-cardiomyocytes infected by SARS-CoV-2 was studied through bioinformatics analysis. Finally, the hub genes associated with SARS-CoV-2 infection were identified, which may reveal the mechanism of heart damage caused by SARS-CoV-2 or suggest biomarkers.