Molecular characterisation of the virus and of the entry phase into the host cells
For each domain, the omics literature was consulted for available data on viral genomics, proteomics and molecular interactions with the host, in light of their possible involvement in the pathogenic mechanisms of the disease, and based on information from other Coronavirus infections. The results of this analysis are summarized in Table 1, in Additional file 1: Tables S3 B and Annex 1, and in Additional file 2: Tables S3 A, and are described in more detail in the following paragraphs.
Table 1 Mechanisms and current evidence about fields of SARS-CoV-2 characterization and entry reported for specific omics data: a) viral genomics and proteomics; b) host-virus interactions at multi-omics levels
Genome evolution and geographical distribution
In this field of investigation, the evolutionary history of SARS-CoV-2 was reconstructed, starting from phylogenetic comparison with related Betacoronaviruses [TE1-TE5].
The Global Initiative on Sharing Avian Influenza Data (GISAID) classification of SARS-CoV-2 clades was reported together with their spread (last updated on November 10th 2020, based on 175,000 genomes).
Genomic hotspots for mutation, drivers of evolution and correlation with COVID-19 pathogenesis
In the SARS-CoV-2 genome, ten hyper-variable hotspots were identified. In the S gene, some regions showed signs of positive selection, i.e. dN/dS > 1, particularly within the receptor binding domain (RBD) in the S1 subunit, in the FURIN cleavage site and in the segment encoding the S2 and S2’ subunits. ORF3a, E, ORF6, ORF7a, ORF8, N, and ORF10 also presented dN/dS > 1, indicating positive selection pressure. People infected with SARS-CoV-2 variants carrying ORF7b and ORF8 deletions had lower odds of developing hypoxia.
Integrated comparative genomics and machine learning techniques identified 11 regions in SARS-CoV-2 genome, reliably predictive of high fatality rate [TE6-TE18].
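The dN/dS > 1 criterion used above can be illustrated with a minimal sketch. It assumes that the counts of nonsynonymous and synonymous differences and sites are already available (for example from a Nei-Gojobori style counting step, which is not shown) and applies a Jukes-Cantor correction. The function names and the input values are hypothetical and are not taken from the reviewed studies, which may have used different estimators.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor correction of a proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return dN/dS from counts of differences and sites (illustrative only)."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


def classify(omega):
    """Interpret a dN/dS ratio in the conventional way."""
    if omega > 1:
        return "positive selection"
    if omega < 1:
        return "purifying selection"
    return "neutral"


# Hypothetical counts for a single genomic region (not real data).
omega = dn_ds(nonsyn_diffs=10, nonsyn_sites=100, syn_diffs=5, syn_sites=100)
print(round(omega, 2), classify(omega))
```

In practice such ratios are estimated per gene or per site with dedicated phylogenetic software; the sketch only shows how the threshold of 1 separates positive from purifying selection.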
Intra-host genomic variability
In SARS-CoV-2 infected hosts, the virus displayed small-scale intra-host variation, while spatial–temporal redistribution of SARS-CoV-2 “quasispecies” in the respiratory and gastro-intestinal tracts of human hosts was observed, with significantly higher genetic diversity in gastrointestinal than in respiratory tract samples [TE19-TE22].
Single viral protein and whole viral proteome studies
This field of investigation concerned omics studies characterizing both single proteins and the whole SARS-CoV-2 proteome. Some of these studies were performed in silico and organized virus/host PPIs in a modular hierarchical scale, allowing the construction of a dynamic and integrated structure named “SARS-CoV-2 dynamicome” [TE24-TE29].
Immune proteomics
The host–pathogen molecular mimicry was investigated on the basis of viral proteomics, also used in studies aimed at developing innovative anti-COVID-19 vaccines [TE30-TE32].
Viral RNA and host protein interactions
In silico studies predicted possible SARS-CoV-2 RNA regions of interaction with host proteins.
Experimental studies described the SARS-CoV-2 RNA–protein interactome in different cell lines infected in vitro with SARS-CoV-2. Other works provided functional interrogation of the host proteins involved in such interactions, revealing that most of them regulated virus entry into host cells, protected the host from virus-induced cell death or were involved in SARS-CoV-2 pathogenicity [TE33-TE39, TE45, TE51]. Besides interactions with host proteins, some studies also identified SARS-CoV-2 genomic regions that are potential small interfering RNA (siRNA) targets or that could interact with host microRNAs (miRNAs) [TE32].
Virus-host protein–protein interactions (PPIs)
Several research studies addressed PPIs between single betacoronaviruses and host proteins, or further detailed, by in silico analysis, already described interactomes [TE40-TE44].
Multilayer analysis of virus-host interactions (transcriptomics, proteomics)
By integrating viral-host transcriptomics and proteomics in a multilayer analysis, it was possible to characterize COVID-19 phenotypes [TE45]. Omics studies particularly relevant for pathogenesis investigations were based on ex vivo analyses of clinical samples from SARS-CoV-2 infected subjects displaying different clinical phenotypes. Some of these investigations used a multi-omics, network-biology-driven approach to identify the principal host components affected by SARS-CoV-2 infection, and lay the basis for the construction of a COVID-19 Disease Map. Additional file 1: Table S2 A provides a more detailed description of the above reported information [TE48, TE49, TE50, TE52, TE53].
Viral entry
Expression of host entry factors in human tissues
Expression of ACE2 and TMPRSS2 in human tissues was addressed by omics approaches in several studies [TE55, TE56, TE57, TE59, TE60, TE63]. Importantly, very low or absent ACE2 expression was reported in organs/tissues considered as the main target for SARS-CoV-2 replication, including lung, bronchus, and nasal mucosa, suggesting a dynamic regulation of entry factors upon infection and a role for possible alternative receptors.
SARS-CoV-2 interaction with entry factors
Two studies explored the cross-talk between SARS-CoV-2 and host proteins during the entry and subsequent steps of viral replication. The first paper identified proteins on the host cell membrane (ATP6V1A, AP3B1, STOM, and ZDHHC5) that may enable binding to SARS-CoV-2 structural proteins. On the other hand, several miRNAs were also found to inhibit proteins involved in viral entry [TE73]. The second study proposed three interactomes from probabilistic modelling, using iDREM (interactive Dynamic Regulatory Events Miner): the first one, involved in creating a suitable environment for the virus, includes ATP6V1A; the second one includes PHB as alternative receptor or co-receptor; the third one, involved in sustaining viral replication, includes oxidative stress and inflammation proteins [TE48].
Regarding the viral variants of concern (VOC), B.1.351, P.1 and some B.1.1.7 strains harbor, among others, mutations potentially important for pathogenesis, such as K417T/N, E484K, and N501Y. These substitutions seem to alter the interaction of the S glycoprotein with the ACE2 receptor, leading to increased transmissibility compared to previously circulating strains and, especially for the B.1.351 and P.1 lineages, to reduced susceptibility to neutralizing antibodies elicited by non-variant strains and by current vaccines. In fact, viral evolution studies show that the RBD of the S glycoprotein is highly variable and that immune escape mutations (e.g., E484K) may emerge independently in multiple lineages and spread worldwide. This can lead to the accumulation of additional changes that may increase the risk of a significant reduction in the efficacy of host immunity, monoclonal antibody therapy and vaccines [TE69, TE70, TE71, TE72].
Pathways
To hierarchically evaluate cellular processes involved in SARS-CoV-2 infection, we assigned a univocal Reactome Code to each cellular mechanism reported in the scanned literature for the pathway analysis. Of note, this approach was performed for proteomics, transcriptomics and bioinformatics studies in vitro, where the study model was represented by cell lines infected with SARS-CoV-2 (Table 2 and in Additional file 1: Tables S4 and Annex 2). The occurrence of each mechanism was organized on the basis of Reactome Pathways (Additional file 3: Table S6).
Table 2 Immune response to SARS-CoV-2 infection in lung and other tissues (A), peripheral blood (B) and specific cell types among blood immune cells (C). Pathways, host signature and body districts, subset per specific omics data: host proteomics, bulk and single cell RNAseq (scRNAseq)
This approach allowed us to highlight several mechanisms altered by SARS-CoV-2 infection, i.e. signal transduction (R-HSA-162582), translation (R-HSA-72766), post-translational protein modifications (R-HSA-597592), immune response (R-HSA-168256), cell cycle (R-HSA-1640170), apoptosis (R-HSA-109581), autophagy (R-HSA-9612973), lipid metabolism (R-HSA-556833) and vesicle-mediated transport (R-HSA-5653656).
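The assignment of a univocal Reactome code to each reported mechanism, followed by counting occurrences per pathway, could be sketched as follows. The Reactome codes are those listed above; the study identifiers, the example annotations and the function name are hypothetical placeholders, and this is only an illustration of the tallying idea, not the procedure actually used to build Additional file 3: Table S6.

```python
from collections import Counter

# Reactome codes as listed in the text above.
REACTOME_CODES = {
    "signal transduction": "R-HSA-162582",
    "translation": "R-HSA-72766",
    "post-translational protein modifications": "R-HSA-597592",
    "immune response": "R-HSA-168256",
    "cell cycle": "R-HSA-1640170",
    "apoptosis": "R-HSA-109581",
    "autophagy": "R-HSA-9612973",
    "lipid metabolism": "R-HSA-556833",
    "vesicle-mediated transport": "R-HSA-5653656",
}


def tally_pathways(study_mechanisms):
    """Count, per Reactome code, the number of studies reporting it."""
    counts = Counter()
    for mechanisms in study_mechanisms.values():
        # set() ensures a mechanism is counted once per study.
        for mechanism in set(mechanisms):
            # A KeyError here flags a mechanism without an assigned code.
            counts[REACTOME_CODES[mechanism]] += 1
    return counts


# Hypothetical annotations (not taken from the reviewed literature).
example_studies = {
    "study_1": ["autophagy", "apoptosis"],
    "study_2": ["apoptosis", "translation", "apoptosis"],
    "study_3": ["immune response"],
}

counts = tally_pathways(example_studies)
for code, n_studies in counts.most_common():
    print(code, n_studies)
```

Counting each mechanism once per study is a design choice of this sketch; counting every mention instead would only require dropping the `set()` call.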
Figure 3 shows how omics data, organized by omics technique and tissue (Proteomics, Metabolomics and Transcriptomics/CyTOF, in A, B and C respectively), contribute to highlight pathways up- or down-regulated in COVID-19 patients compared to healthy subjects, or in severe COVID-19 patients compared to mild patients.
With respect to signal transduction, the mTOR pathway is strongly affected during SARS-CoV-2 infection, leading to the activation of PI3K/AKT and TNF cascades [13] [TE74, TE75, TE76, TE77, TE78]. Moreover, considering that the production of viral proteins depends on host cap-dependent translation [14], this mechanism is also altered by SARS-CoV-2 infection. While on one hand the virus usurps the cellular translation machinery to promote its own reproduction, on the other hand cells attempt to reduce translation to counteract the infection. It has been extensively reported that several SARS-CoV-2 proteins bind E3-ubiquitin ligases, usurping the cellular degradation machinery and thus promoting virus replication [5, 6] [TE40, TE41, TE49, TE80, TE83]. Other pathways extensively addressed in COVID-19 studies are those of innate and adaptive immunity. Specifically, IFN, TNF and NF-κB pathways, cytokine SPP1, GRN, the receptor tyrosine kinase AXL [TE85], NLR and RIG-I signaling are strongly altered by SARS-CoV-2 [TE76], contributing to severe forms of COVID-19 [6]. Similarly, the cell cycle is affected by SARS-CoV-2 infection, with a rapid reshaping of several host mechanisms leading to cell cycle arrest. However, it is important to consider that most of the reported studies have been performed in cell cultures in vitro, which represent an undisputed model for infection; nevertheless, the cell cycle is intrinsically altered in in vitro culture systems, introducing a bias in COVID-19 model conceptualization. Several network analyses of cellular pathways related to SARS-CoV-2 infection, both in vitro [TE44, TE74, TE77, TE78, TE80] and in silico [TE73], show that cell death, in particular apoptosis, is affected. In addition, Stukalov and colleagues observed an accumulation of several autophagic proteins (e.g. SQSTM1, GABARAPL2, NBR1, CALCOCO2, MAP1LC3B, TAX1BP1) following ORF3 expression, also observed in cells infected by the virus (SQSTM1, MAP1LC3B), suggesting an inhibition of the autophagic pathway during SARS-CoV-2 infection [TE81]. Lipid metabolism is also affected, as shown by the binding of SARS-CoV-2 nsp6 and ORF9c to the host protein SIGMAR1, involved in lipid metabolism, and to ER stress response proteins [6] [TE77, TE83]. Finally, it is emerging that different coronaviruses hijack specific RAB GTPases during the infectious cycle. In fact, host RAB2A and RAB7A are critical for HCoV-229E, HCoV-OC43 and HCoV-NL63 infection, while RAB10 and RAB14 play an important role in SARS-CoV-2 infection [TE45].
Host signatures
The host signature was addressed in studies evaluating the systemic profile of soluble mediators by proteomics and metabolomics approaches. Cellular immune response to SARS-CoV-2 infection was evaluated by transcriptomics, proteomics, and high dimensional single cell analysis (mass cytometry, CyTOF) in peripheral blood and tissues, reported in Table 2, and detailed in Additional file 1: Tables S4 and Annex 3.
Systemic profile of soluble mediators
Proteomic studies
During the early immune response, activation of the type I/III IFN response represents a key local innate immune player and is associated with chemokines [TE87] and proinflammatory mediators [TE88], leading to a massive release of inflammatory mediators, called “cytokine storm”. In particular, mild COVID-19 is characterized by an increase of IFN-α, IFN-γ, IL6, IL10, IL1RA [TE87, TE88, TE89], while severe disease is characterized by a downregulation of IFN-α and a parallel up-regulation of IFN-β, IFN-γ, IL6, TNF, IL10, CCL7 [TE89, TE90, TE94]. Finally, critical disease is characterized by the upregulation of IL6, TNFs, IL10, RIPK3 [TE89]. Several proteins linked to IL6-mediated proinflammatory cytokine signaling are strongly expressed in severe COVID-19 (Additional file 1: Annex 3A.1). The acute phase proteins (APPs) are an additional class of mediators involved in the early phase immune response in COVID-19. These proteins are up-regulated in the severe forms of COVID-19, and can induce inflammatory cytokines, influence lipid metabolism, and induce neutrophil activation, as shown for S100A8 and S100A9 [15], thus possibly contributing to amplifying the cytokine storm. Moreover, APPs can also modulate platelet aggregation and activation of the coagulation cascade [16, 17], which are closely related to the severity of COVID-19. In contrast, in early COVID-19 infection, down-regulation of complement and coagulation cascades (C1R, C7) has been observed compared to influenza infection [TE88] (Additional file 1: Annex 3A.2).
Several enzymes with antimicrobial activity are increased in COVID-19 patients’ sera, including CST3, DEFA1, and LYZC, indicating a possible secondary bacterial infection [TE97].
Metabolomic studies
Levels of most lipid-related molecules are altered in moderate and severe COVID-19 patients with a clear preference toward their downregulation. Main observed alterations are related to lipoproteins, cholesterol, triglycerides, glycerophospholipids and sphingolipids. Plasma lipid perturbations in COVID-19 patients are consistent with alterations in liver lipoprotein metabolism and changes in circulating exosome contents (Additional file 1: Annex 3A.3).
Metabolites of the tricarboxylic acid cycle (TCA) and β-oxidation are reduced in COVID-19, particularly in severe patients, whereas metabolic intermediates of the glycolysis and pentose phosphate pathways are increased [TE90, TE96, TE100, TE101, TE102, TE103]. This reduction may be the consequence of declined lung functions and blood oxygen level decrease, but may also mirror a response to nutritional changes, especially in severe patients (Additional file 1: Annex 3A.4).
In the serum of COVID-19 patients, amino acids and their derivatives are significantly decreased, especially the gluconeogenic and sulphur-containing ones [TE101, TE102, TE103, TE104]. Moreover, compounds of arginine metabolism, including urea cycle metabolic intermediates and arginine derivatives, are reduced in the serum of COVID-19 patients [TE103, TE104]. The observed changes in amino acid levels could be indicative of liver dysfunction. In addition, reduced circulating tryptophan levels, associated with elevated kynurenine levels, are observed in COVID-19 patients [TE94, TE104, TE146]. The kynurenine/tryptophan ratio is a general measure of indoleamine 2,3-dioxygenase (IDO) activity. Such activity is induced by IFN-γ in response to viral infections and plays an immunoregulatory role by limiting inflammation [18] (Additional file 1: Annex 3A.5).
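As a minimal illustration of how the kynurenine/tryptophan ratio mentioned above is used as a proxy of IDO activity, the following sketch computes the ratio from two serum concentrations expressed in the same unit. The function name and the concentration values are hypothetical and are not taken from the reviewed studies.

```python
def kyn_trp_ratio(kynurenine, tryptophan):
    """Return the kynurenine/tryptophan ratio (both in the same unit)."""
    if kynurenine < 0:
        raise ValueError("kynurenine concentration cannot be negative")
    if tryptophan <= 0:
        raise ValueError("tryptophan concentration must be positive")
    return kynurenine / tryptophan


# Hypothetical serum concentrations in micromol/L (not real measurements).
control_ratio = kyn_trp_ratio(kynurenine=2.0, tryptophan=60.0)
patient_ratio = kyn_trp_ratio(kynurenine=4.0, tryptophan=40.0)

# Lower tryptophan with higher kynurenine yields a higher ratio,
# which would be read as increased IDO activity.
print(patient_ratio > control_ratio)
```

Because the ratio is dimensionless, only the consistency of units between the two measurements matters.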
Transcriptomics/CyTOF studies
Immune response in peripheral blood (bulk RNAseq)
Early SARS-CoV-2 infection triggers a powerful, IFN-driven transcriptional response in peripheral blood. Type I IFN response was impaired in severe and critical COVID-19 patients: striking downregulation of ISGs, IRF-1 and STAT3, absence of circulating IFN-β in patients with all disease-severity grades and low IFN-α production in severe COVID-19 patients were reported [TE90]. Moreover, elevated levels of chemokines and chemokine receptors were detected in severe patients, exhibiting an increase in neutrophils. Downregulation of negative regulators of innate immune system and TCR signaling kinases and adaptors was observed in severe patients (Additional file 1: Annex 3B).
Innate immune cell compartment (scRNAseq/CyTOF)
The initial local respiratory SARS-CoV-2 infection elicits dynamic changes of circulating blood cells with changes in innate immunity parameters. An elevated neutrophil/lymphocyte ratio has been identified as a sign of COVID-19 severity [TE89]. Investigation of neutrophil transcriptomic signatures highlighted that excessive neutrophil activation is associated with severe COVID-19 more frequently than with mild disease. Moreover, low-density neutrophils with an immature phenotype were increased in severe disease [TE96, TE112]. An increase of classical CD14+ monocytes, especially in convalescence stages, of non-classical CD16+ monocytes and of natural killer (NK) cells, also with an exhaustion phenotype, was observed [TE89, TE105, TE112, TE113]. Impaired IFN-α production by plasmacytoid dendritic cells was also observed in COVID-19 patients. However, some ISGs were up-regulated in monocytes and dendritic cells (DCs) [TE89] (Additional file 1: Annex 3C).
Adaptive immune cell compartment (scRNAseq/CyTOF)
SARS-CoV-2 infection has a strong effect on the transcriptional profile of T and B cells. scRNAseq confirmed that the percentage of circulating lymphocytes, including CD4+ and CD8+ T cells, drops with increasing severity [TE89, TE112, TE113]. The increased expression of genes related to T cell apoptosis in COVID-19 patients may contribute to circulating lymphocyte depletion [TE83]. The upregulation of genes related to a strong T-cell response characterizes the immune response against SARS-CoV-2 in mild patients, reflecting T-cell signalling activation and T-cell differentiation, followed by a rapid reduction thereafter. In contrast, negative T-cell signalling was persistently observed over time in severe patients. The transcriptional signatures of different T-cell subsets show that severe patients present an increase of naive T cells and a decrease in activated effector T cells as compared to mildly affected patients. Moderate COVID-19 patients present a proliferative exhausted CD8+ T cell subpopulation. This cell population has a high cytotoxic signature, while maintaining its naïve character. IL17-A and IL17-F are increased in COVID-19 patients [TE90], and genes related to IL17 signalling are significantly enriched in the severe-fatal group [TE96]. Finally, a significant activation of naïve B cells and expansion of antibody-secreting cells (ASCs) has been observed both in moderate and in severe patients, as compared to healthy and mild/asymptomatically infected subjects (Additional file 1: Annex 3D).
Immune response in lung and other tissues
Major deregulation of the innate immune response has been observed in lung samples. COVID-19 patient lungs showed a compartmentalization of innate immune cells (neutrophils and monocytes) in response to chemokine secretion [TE106, TE108]. The transcriptional profiling of the lung tissue showed the over-expression of genes related to neutrophil activation and the generation of neutrophil extracellular traps (NETs), confirming the role of NET formation in the immunopathology induced by SARS-CoV-2 infection.
Transcriptional profiling of nasopharyngeal swabs showed activation of inflammatory response genes, IFN-α response, IL6/JAK/STAT3 signalling, and the complement cascade in COVID-19 patients, while a down-modulation of anti-inflammatory pathways was observed by scRNAseq in CD14+/CD16+ cells from severe patients [TE99].
The colon transcriptome of COVID-19 patients revealed an up-regulation of genes related to the response to TGF-β, whereas a down-modulation of genes involved in immune cell activation was found in fatal cases compared to healthy donors [TE110].
Proteomics analysis revealed the up-regulation of iNOS, IL1β and IL6 proteins in lung tissue of COVID-19 patients [TE113]. Moreover, proteomic analyses identified many key proteins, such as cathepsins B and L, and inflammatory response modulators, highly expressed and translated in fatal cases compared to healthy donors [TE110] (Additional file 1: Annex 3E).
Phenotypes
We selected the studies with clear stratification of patients by disease severity and, from these, the multi-omics data that highlighted significant differences between healthy controls and COVID-19 patients, and between mild and severe clinical presentations. We then classified the selected studies into four topics: (1) Key Genes and Proteins in SARS-CoV-2–host interactions and pathogenesis in the Lung; (2) DEG and DEP analysis in other organs and tissues; (3) Hub genes and pathways of innate immune response; (4) Comorbidities, further subclassified into COVID-19-associated comorbidities not sharing COVID-19 pathogenesis, and comorbidities associated with and related to COVID-19 pathways. These are reported in Table 3, Additional file 1: Tables S5 and Annex 4.
Table 3 Pathogenic mechanisms in COVID-19 phenotype: SARS-CoV-2–host interactions in the lung (A), DEG and DEP analysis in other organs and tissues (B), Hub genes and pathway of innate immune response (C), Comorbidities COVID-19 associated not sharing COVID-19 pathogenesis (D), Comorbidities associated and related to COVID-19 pathway (E)
As shown in Table 3A, our review confirms that SARS-CoV-2 RNA is highly localized in cells that express TMPRSS2, especially ciliated and secretory cells in the airway epithelium, and Alveolar Type 1 (AT1) cells in the lung [TE115]. As reported before, transcriptomics and proteomics analyses show the pathways most involved in patients with severe disease [TE107, TE108, TE110, TE116, TE117]. Both mild and severe COVID-19 patients present elevation of chemokines associated with lung inflammatory disorders, such as acute respiratory distress syndrome, asthma, and pulmonary fibrosis [TE88].
Genomics studies highlight that chromosome 3 is significantly associated with respiratory failure, since genes functionally interacting with ACE2 are located in its loci [TE130].
Table 3B reports the pathways most involved in severe COVID-19, as highlighted by transcriptomics and proteomics data on the gastrointestinal, genital and neurologic districts. The potential susceptibility of these tissues to SARS-CoV-2 entry is due to the high co-expression of ACE2 and TMPRSS2 [TE122-TE127]. However, in these organs SARS-CoV-2 does not cause the damage observed in the respiratory tract, suggesting that ACE2/TMPRSS2 expression alone is not sufficient to mediate tissue injury.
As shown in Table 3C, omics data highlight a dysregulation of innate immune response-related pathways in severe patients [TE131, TE132]. Both mild and severe patients present a significant downregulation of TCA and of glycolytic pathways, and upregulation of HIF1A signalling and host defense pathways. Proteome host signatures indicate high specificity of several inflammatory modulators, particularly IL6, IL1B, and TNF [TE138], as confirmed by the significant increase of these modulators observed in severe patients in clinical studies [TE96]. Immunosuppression and tight junction impairment occur in the early phase of COVID-19, while the immune response is activated later [TE135]. The defective monocyte activation, combined with the dysregulated myelopoiesis, observed in patients with severe disease, may cause a continuous state of inflammation and ineffective host immune response [TE112].
The inflammation described in severe COVID-19 is also reflected by metabolomics and lipidomics data, which show imbalanced homeostasis of glycolysis, lipogenesis, heme and ketone biosynthesis, gluconeogenesis, fatty acid oxidation, and cholesterol biosynthesis, through the activation of β-oxidation pathways [TE109]. Metabolomics and lipidomics characterize the difference between mild and severe forms in quantitative terms, and are mostly found in COVID-19 phenotypes associated with comorbidities. The strong association between inflammation and metabolic alterations allows the identification of two groups of comorbidities: (1) COVID-19-associated diseases that increase patient frailty by COVID-19-independent pathogenic mechanisms (e.g., chronic heart disease); (2) COVID-19-associated diseases that increase patient frailty by COVID-19-dependent pathogenic mechanisms (e.g., diabetes) (Table 3D, E).
Genomics studies suggest that ACE2 polymorphisms might be associated with cardiovascular and pulmonary conditions by altering the AGT-ACE2 interactions, and transcriptomics data confirm the upregulation of the gene encoding ACE2 receptor in lung tissue in several comorbidities associated with severe COVID-19, such as COPD or PAH, and even in people who smoke.
Diabetes is the best described comorbidity related to COVID-19 pathways. This complex metabolic disease can complicate COVID-19 by several mechanisms: (1) presence of bone marrow changes, predisposing to excessive proinflammatory response and contributing to insulin resistance, reducing vascular repair and worsening function of heart, kidney, and systemic vasculature; (2) increased circulating levels of FURIN, which cleaves the S glycoprotein; (3) dysregulated autophagy, which may promote replication and/or reduce viral clearance; (4) gut dysbiosis, leading to widespread systemic inflammation, increased glucose and sodium absorption, and reduced absorption of tryptophan needed for glucose homeostasis [TE142].
COVID-19 patients showed relevant changes in serum levels of lipoprotein subclasses and their components [TE94, TE100, TE102], mainly reflecting the metabolic pathways of lysine degradation, metabolism of taurine, hypotaurine, alpha-linolenic acid, glycerophospholipid, arginine, proline, and arginine biosynthesis [TE145]. Severe patients are characterized by pathways listed in Table 3E, possibly linked to a reduced hepatic capacity to oxidize acetyl-CoA in the mitochondria, consistent with serum glucose elevation [TE104].
Single-cell transcriptomics (scRNAseq) also contributed to a better understanding of the etiology of COVID-19 neurological sequelae, although further analyses are needed [TE124].