1 Introduction

Coronaviruses (CoV) belong to the Coronavirinae subfamily, which together with the subfamily Torovirinae forms the virus family Coronaviridae within the order Nidovirales. The Coronavirinae subfamily harbors four genera (Fig. 4.1): Alpha-, Beta-, Gamma-, and Deltacoronavirus (Adams and Carstens 2012; Woo et al. 2012). Coronaviruses are enveloped viruses that contain a single-stranded RNA genome of positive polarity comprising roughly 30 kilobases. The virus particles are spherical, with a diameter of 80–120 nm (Belouzard et al. 2012). They contain the genome, which is associated with the nucleoprotein (NP), forming a ribonucleoprotein complex (RNP) (Belouzard et al. 2012). Depending on the virus, three or four viral proteins are embedded in the viral envelope: Membrane protein (M), envelope protein (E), and spike glycoprotein (S) are present in all coronaviruses, while some members of the genus Betacoronavirus additionally contain a hemagglutinin-esterase protein (HE). M and E are required for viral assembly (Belouzard et al. 2012), HE promotes release of viruses from infected cells (Vlasak et al. 1988), and the S protein, which is the focus of this review, facilitates viral entry into target cells. The S protein is also responsible for the corona-like shape of these viruses in electron micrographs, on the basis of which the name coronavirus was coined (Berry and Almeida 1968; Du et al. 2009).

Fig. 4.1

Phylogenetic relationship among coronaviruses based on their spike glycoproteins. The amino acid sequences of coronavirus spike glycoproteins representing all four genera (Alpha-, α; Beta-, β; Gamma-, γ; Deltacoronavirus, δ) within the Coronavirinae subfamily were aligned and utilized to generate a phylogenetic tree (neighbor-joining method). Italicized numbers at the nodes indicate bootstrap values
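A neighbor-joining tree such as the one in Fig. 4.1 is built from a matrix of pairwise distances between aligned sequences. The following minimal Python sketch illustrates one common distance measure, the p-distance (the fraction of aligned positions that differ). It is an illustration only: the caption does not state which distance measure was used for the figure, the function names are hypothetical, and the short aligned peptides are invented toy data rather than real spike sequences.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions that differ, skipping columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # a gap in either sequence: column is not compared
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of aligned sequences."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(alignment, 2)
    }


# Invented toy alignment (hypothetical names and residues).
toy_alignment = {
    "virus_A": "MKLTRSAVRS",
    "virus_B": "MKLTRSAVKS",
    "virus_C": "MRL-QSGVKS",
}
matrix = distance_matrix(toy_alignment)
```

A neighbor-joining implementation would take such a matrix as input and repeatedly join the pair of taxa that minimizes the total branch length; bootstrap values are then obtained by resampling alignment columns and rebuilding the tree.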

Coronaviruses infect a broad range of vertebrate hosts with alpha- and betacoronaviruses targeting different mammals, while gamma- and deltacoronaviruses mainly infect birds (Breslin et al. 1999; Cavanagh et al. 2001; Jonassen et al. 2005). It is believed that coronaviruses of the genera Alpha- and Betacoronavirus have emerged from bats, while gamma- and deltacoronaviruses seem to originate from birds (Graham and Baric 2010; Woo et al. 2012). Coronavirus infection is mainly associated with respiratory and enteric diseases but, depending on the virus, can also lead to hepatic (Lane and Hosking 2010) and neurologic manifestations (Foley and Leutenegger 2001).

Human coronaviruses (HCoVs) have been known since 1965, when they were identified in patients suffering from the common cold (Tyrrell and Bynoe 1965). Most HCoVs known today (HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV-HKU1) infect ciliated epithelial cells of the nasopharynx (Afzelius 1994; Weiss and Navas-Martin 2005) and cause self-limiting upper respiratory tract diseases in immunocompetent individuals, with symptoms like headache, sore throat, and malaise being frequently observed. In rare cases, infection can spread to the lower respiratory tract, causing bronchiolitis, bronchitis, and pneumonia, particularly in infants, the elderly, and immunocompromised individuals (Masters and Perlman 2013).

Within the last 20 years, two novel HCoVs emerged that cause severe and frequently fatal infections in humans (Drosten et al. 2003; Lu et al. 2015; Reusken et al. 2016; Zaki et al. 2012). In 2002, the outbreak of severe acute respiratory syndrome coronavirus (SARS-CoV) in Southern China and its subsequent worldwide spread was associated with roughly 8100 infections, of which 10% took a fatal course, with the elderly being mainly affected (Peiris et al. 2003). In the aftermath of the SARS pandemic, it was revealed that bats harbor numerous SARS-CoV-related viruses as well as other coronaviruses that may be zoonotically transmitted to humans via intermediate hosts (Hu et al. 2015; Lu et al. 2015). In 2012, the Middle East respiratory syndrome coronavirus (MERS-CoV), another novel, highly pathogenic coronavirus, emerged in Saudi Arabia, causing a SARS-like disease (Zaki et al. 2012). MERS-CoV infection is associated with a case-fatality rate of 35% (WHO Health Organisation 2017), and comorbidities like diabetes mellitus, chronic renal disease, and hypertension constitute major risk factors for a lethal outcome of the disease (Assiri et al. 2013). Like SARS-CoV, MERS-CoV is a zoonotic virus originating from an animal reservoir, dromedary camels (Mohd et al. 2016). As the MERS epidemic is still ongoing, there are concerns that human-to-human transmission, which is very infrequent at present (Alsolamy and Arabi 2015), might become more efficient due to adaptive mutations in the viral genome (Dudas and Rambaut 2016; Reusken et al. 2016).

Coronaviruses also constitute a severe threat to animal health. For instance, porcine epidemic diarrhea coronavirus (PEDV) infects the epithelia of the small intestine and causes villous atrophy, resulting in diarrhea and severe dehydration (Debouck and Pensaert 1980; Jung et al. 2006). The virus was first described in Europe in the 1970s and was originally not perceived as a major threat to animal health (Debouck and Pensaert 1980; Pensaert and de 1978). Recently, however, highly virulent PEDV strains emerged that cause lethal infection in 80–100% of piglets and weight loss in adult pigs (Debouck and Pensaert 1980; Lee 2015). PEDV spread can have severe consequences: The introduction of PEDV in the USA resulted in major economic losses among pig farmers and a 10% decline in the American pig population (Lee 2015; Li et al. 2012; Liu et al. 2016; Stevenson et al. 2013). As there are no effective vaccines or specific treatments available, current containment strategies are mainly limited to rigorous disinfection routines.

Coronaviruses constitute a severe threat to animal and human health, as discussed above, and the development of antivirals is an important task. Host cell factors required for coronavirus spread but dispensable for cellular survival are attractive targets, since their blockade might suppress infection by several coronaviruses and might be associated with a high barrier against resistance development. The viral S protein mediates the first step in coronavirus spread, viral entry into target cells. However, the S protein is synthesized as an inactive precursor and requires cleavage by host cell proteases for conversion into an active form. The cellular enzymes responsible constitute targets for antiviral intervention, and recent studies provided important insights into their identity, expression, and target sites in the viral S protein. Moreover, novel mechanisms governing protease choice by coronaviruses have been uncovered. The present manuscript will review and discuss these findings, focusing on SARS-CoV and MERS-CoV.

2 The Coronavirus Spike Protein: Viral Key for Entry into the Target Cell

Domain organization. The S protein of coronaviruses contains an N-terminal signal peptide which primes the nascent polyprotein for import into the ER. In the ER, the S protein is extensively modified with N-linked glycans, which may provide protection against neutralizing antibodies (Walls et al. 2016b). After passing the quality control mechanisms of the ER, the S protein is transported to the site of viral budding, the endoplasmic reticulum/Golgi intermediate compartment (ERGIC). Homotrimers of the S protein, for which atomic structures have recently been reported (Kirchdoerfer et al. 2016; Walls et al. 2016a), are incorporated into the viral membrane and mediate viral entry into target cells. For this, the S protein combines two biological functions: First, its surface unit, S1, binds to a specific receptor located at the surface of host cells and thereby determines cellular tropism and, as a consequence, viral pathogenesis. Second, the transmembrane unit, S2, mediates fusion between the viral envelope and a target cell membrane (Fig. 4.2).

Fig. 4.2

Domain organization and structure of the coronavirus spike glycoprotein. (a) Schematic illustration of a coronavirus spike (S) glycoprotein consisting of the subdomains S1 and S2. At the N-terminus of the S1 subdomain resides the signal peptide that allows for introduction of nascent S proteins into the host cells’ secretory pathway. Additionally, this subdomain harbors amino acid residues responsible for virus attachment to target cells (receptor-binding domain, RBD). The S2 subdomain contains the structural components of the membrane fusion machinery (fusion peptide, heptad repeats (HR) 1 and 2), anchors the S protein in the lipid envelope via the transmembrane domain, and interacts with the viral ribonucleoprotein complex through its endodomain. Location of the S1/S2 border and the S2′ position is indicated by black triangles. (b) 3D-model of trimeric SARS-CoV S protein (amino acid residues 261–1058) schematically positioned on the outside of the viral envelope. The protein structure ID, 5WRG, (Gui et al. 2017) was downloaded from the RCSB Protein Data Bank and analyzed using the YASARA software (www.yasara.org, Krieger and Vriend 2014). Each S protein monomer is colored individually, and the position of the RBD is indicated. Further, the locations of the arginines at the S1/S2 border (R667) and S2′ position (R797) are highlighted

Cellular receptors. Coronaviruses use a broad range of receptors for entry into target cells (Table 4.1). Alphacoronaviruses like HCoV-229E, transmissible gastroenteritis coronavirus (TGEV), and porcine respiratory coronavirus (PRCV) engage the zinc metalloproteinase CD13 from their natural host as well as feline CD13 (feCD13) as entry receptor (Tresnan and Holmes 1998), with different residues in feCD13 being required for recognition by the respective coronaviral S proteins (Tusell et al. 2007). Despite high amino acid sequence similarity within the S1 subunit, the S proteins of HCoV-229E and -NL63 interact with different host cell receptors, namely, CD13 (Yeager et al. 1992) and angiotensin-converting enzyme 2 (ACE2) (Hofmann et al. 2005). Notably, ACE2 is also employed by SARS-CoV for entry (Li et al. 2003; Wang et al. 2004), although the S protein of this betacoronavirus and NL63-S share little sequence similarity. Other members of the betacoronaviruses use different entry receptors: MERS-CoV uses human dipeptidyl peptidase 4 (DPP4), mouse hepatitis virus (MHV) interacts with carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1) (Dveksler et al. 1991; Williams et al. 1991), and neuraminic acid is used by bovine CoV and HCoV-OC43 for attachment to cells (Kunkel and Herrler 1993; Schultze et al. 1991). Similarly, sialic acid-containing surface molecules serve as attachment factors or receptors for TGEV, PEDV, and avian infectious bronchitis virus (IBV) (Cavanagh and Davis 1986; Deng et al. 2016; Krempl et al. 1997; Liu et al. 2015; Schultze et al. 1992).

Table 4.1 Host cell receptors of selected alpha- and betacoronaviruses

Structural insights into receptor choice. The proteolytic priming of the viral S proteins is the focus of this review. However, priming and receptor binding can be intimately connected, and structural analyses provide valuable explanations for coronavirus receptor specificity. Therefore, structural aspects of S protein binding to its receptor will be briefly discussed. Binding to a receptor is mediated by a receptor-binding domain (RBD), which is located in the surface unit S1. The S1 subunit generally consists of an N-terminal (NTD) and a C-terminal domain (CTD) (Li 2012), which can serve as RBD either alone or in combination. For most coronaviruses analyzed, the S1-NTD is responsible for binding to host cell glycans (Krempl et al. 1997; Liu et al. 2015; Peng et al. 2012; Promkuntod et al. 2014), whereas the S1-CTD targets a proteinaceous receptor (Du et al. 2013; Godet et al. 1994; Hofmann et al. 2006; Lin et al. 2008; Liu et al. 2015; Mou et al. 2013; Wong et al. 2004). All S1-CTDs investigated so far are characterized by a core domain overlaid by an external region, which directly contacts the receptor (Li 2016). The S1-CTD of SARS-S comprises a core of five β-sheets in antiparallel orientation, headed by a rather globular external region (Li et al. 2005a) in which amino acids N479 and T487 mediate high affinity binding to ACE2 (Li et al. 2005b). The S protein of SARS-CoV from palm civets, a potential intermediate host (Guan et al. 2003; Ksiazek et al. 2003; Rota et al. 2003; Song et al. 2005; Wu et al. 2005), harbors amino acids at positions 479 and 487 which preclude efficient binding to human ACE2 (Li 2008), and acquisition of mutations at these positions was sufficient for cross-species transmission during the SARS epidemic (Li 2008; Li et al. 2005b; Qu et al. 2005; Song et al. 2005; Wu et al. 2011, 2012). Within human ACE2, two lysine residues (K31 and K353) are critical for SARS-S binding (Li 2008; Wu et al. 2011, 2012), and an exchange to histidine at position 353 present in murine ACE2 renders this protein unsuitable for efficient SARS-S binding (Li et al. 2004, 2005b). Similarly, the rat homologue of ACE2 contains a glycosylated asparagine at position 82 which sterically blocks S protein interaction (Frieman et al. 2012; Li et al. 2004). These findings show that subtle variations within the S protein and its receptor can dramatically impact cross-species transmission of coronaviruses.

The core domain of the S1-CTD in MERS-S structurally resembles that of SARS-S (Chen et al. 2013; Lu et al. 2013; Wang et al. 2013; Yuan et al. 2017), but the extended core domains are different, with the MERS-S extended core consisting of antiparallel β-sheets forming a flat surface which targets DPP4 (Raj et al. 2013). The MERS-S binding site on DPP4 is located within a propeller-like structure conserved in bat, camel, and human DPP4 (Barlan et al. 2014; van et al. 2014), and MERS-related CoV have been isolated from both bats and camels (Alagaili et al. 2014; Annan et al. 2013; Haagmans et al. 2014; Lau et al. 2013). In contrast, rodent DPP4 homologues are nonfunctional as MERS-CoV receptors (Cockrell et al. 2014; Coleman et al. 2014; Fukuma et al. 2015; Peck et al. 2015; Raj et al. 2014), probably because of steric hindrance caused by glycosylation of rodent DPP4 (Peck et al. 2015).

In a recent publication, Yuan and colleagues analyzed trimeric MERS- and SARS-S proteins in their pre-fusion conformation using single-particle cryo-electron microscopy (Yuan et al. 2017). Their results revealed an unexpected flexibility of the respective RBDs: in the “lying state,” the RBDs are buried inside the trimer, whereas in the “standing state” the RBDs are exposed for receptor interaction (Yuan et al. 2017). MERS-S1/S2 trimers appeared with one or two of the RBDs in the standing conformation, and were thus able to contact DPP4, whereas SARS-S trimers showed two or all three RBDs in the lying state, and were thus incapable of receptor binding without further conformational change. The flexibility of the RBDs might therefore facilitate receptor interaction for subsequent virus entry (Yuan et al. 2017).

Finally, it should be noted that the RBD constitutes the most important target for neutralizing antibodies (Bonavia et al. 2003; Breslin et al. 2003; Godet et al. 1994; He et al. 2004; Kubo et al. 1994). Additionally, sequence comparison of six HCoV S2 domains suggests that the fusion peptide, the HR1 domain, and the central helix, which are exposed at the surface of the stem region of S protein trimers, can also be targeted by neutralizing antibodies (Yuan et al. 2017). Therefore, the structural information discussed above not only provides insights into S protein receptor interactions but also helps to understand how they can be inhibited by antibodies (Du et al. 2008; Lan et al. 2015; Oh et al. 2014; Tai et al. 2017; Walls et al. 2016a).

Membrane fusion. The transmembrane unit S2 harbors domains required for fusion between viral and host cell membrane, including a fusion peptide and two heptad repeats (HR1 and HR2). These elements are followed by a transmembrane (TM) domain and a C-terminal intracytoplasmic tail (Fig. 4.2), which plays a role in S protein sorting. The HR domains consist of α-helices, and their position and amino acid sequences are conserved among all groups within the coronavirus family (de Groot et al. 1987). Membrane fusion commences with the insertion of the fusion peptide into the target cell membrane. Subsequently, the HR regions fold back onto each other, resulting in the formation of a thermostable six-helix bundle structure (Bosch et al. 2003; Duquerroy et al. 2005; Lu et al. 2014; White and Whittaker 2016). As a consequence, the membranes are pulled into close contact and ultimately fuse. Several unrelated viral glycoproteins exhibit the same domain organization and membrane fusion mechanism as CoV S proteins (Dimitrov 2004; White and Whittaker 2016). These proteins are collectively termed class I membrane fusion proteins and contain α-helices as the predominant structural element (Belouzard et al. 2012; Bosch et al. 2003; Tripet et al. 2004; White and Whittaker 2016). All viral class I membrane fusion proteins require a trigger, namely low pH and/or potentially receptor binding, to overcome the energy barrier associated with the membrane fusion reaction. Moreover, viral class I membrane fusion proteins are invariably synthesized as inactive precursors and depend on priming by host cell proteases to transit into an active form. The general aspects of CoV S protein priming will be discussed in the next section.

3 Proteolytic Priming of Coronavirus Spike Proteins: Basic Concepts

The proteolytic separation of the S1 and S2 subunits, termed priming, provides the CoV S protein with the structural flexibility required for the membrane fusion reaction. Initial studies, conducted with the envelope protein of human immunodeficiency virus (HIV) and the hemagglutinin of highly pathogenic avian influenza A viruses (FLUAV), indicated that cleavage occurs in the constitutive secretory pathway of infected cells and is carried out by furin or related subtilisin-like proteases (Hallenberger et al. 1992; Stieneke-Gröber et al. 1992). Moreover, cleavage was shown to occur at the border between the surface and transmembrane units of these glycoproteins (Hallenberger et al. 1992; Stieneke-Gröber et al. 1992). However, subsequent studies, many of which were conducted in recent years, showed that priming of CoV S proteins is substantially more complex and can impact the cellular localization of membrane fusion. The major advances in our understanding of S protein priming relative to early studies will be briefly outlined below and will then be discussed in detail in the context of SARS-CoV and MERS-CoV infection.

Two cleavage sites. Initial studies reported cleavage of viral glycoproteins at the border between surface and transmembrane unit, but more than one cleavage event might be required for S protein activation (Belouzard et al. 2009; Millet and Whittaker 2014). Thus, it is now appreciated that several S proteins are cleaved at the interface between the S1 and S2 subunits, termed S1/S2 site, and at a site located near the N-terminus of the fusion peptide, termed S2′ site (Fig. 4.3). The latter cleavage might be of particular importance since it generates the mature N-terminus of the fusion peptide, which is required for insertion into the target cell membrane and thus the successful execution of the membrane fusion reaction (Belouzard et al. 2009; Millet and Whittaker 2014).

Fig. 4.3

Amino acid residues at the S1/S2 interface and S2′ position among different coronavirus spike proteins. Partial sequence alignment of amino acid residues of coronavirus spike glycoproteins from all four genera located at sites used for S protein activation, the S1/S2 border and the S2′ position (numbers indicate the positions of the aligned regions within the respective full-length S proteins). Basic amino acid residues upstream of the S1/S2 border and the S2′ position are written in bold letters. Moreover, mono- and multibasic motifs suitable for host cell protease-mediated S protein activation are highlighted (gray boxes)
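Candidate mono- and multibasic motifs of the kind highlighted in such an alignment can be located with a simple pattern search. The Python sketch below assumes the commonly cited minimal furin-type motif R-X-X-R (arginine at positions P4 and P1) as the definition of a multibasic site and treats any single arginine or lysine as a monobasic site; both definitions are simplifying assumptions, since actual cleavage also depends on sequence context and accessibility of the site. The function names are hypothetical and the peptide is invented, not a real spike sequence.

```python
import re

# Assumed minimal multibasic (furin-type) pattern: Arg at P4 and P1, any two
# residues in between. The lookahead lets overlapping motifs all be reported.
MULTIBASIC = re.compile(r"(?=(R..R))")


def find_multibasic_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position of the P1 arginine, motif) for each R-X-X-R hit."""
    return [
        (match.start() + 4, match.group(1))
        for match in MULTIBASIC.finditer(sequence.upper())
    ]


def find_monobasic_sites(sequence: str) -> list[int]:
    """Return 1-based positions of every arginine or lysine in the sequence."""
    return [i + 1 for i, residue in enumerate(sequence.upper()) if residue in "RK"]


toy_sequence = "GSTLRSVRSAGNK"  # invented peptide for illustration only
multibasic_hits = find_multibasic_sites(toy_sequence)
monobasic_hits = find_monobasic_sites(toy_sequence)
```

Applied to real S protein sequences, hits would still need to be checked against experimentally mapped cleavage positions, because a matching motif alone does not show that a protease uses the site.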

Multiple priming enzymes, multiple cellular locations for priming. Several enzymes, pertaining to different protease families, can be hijacked by CoV S proteins for priming. The pH-dependent cysteine protease cathepsin L, TMPRSS2, and other members of the type II transmembrane serine protease (TTSP) family as well as the serine protease furin can prime S proteins during viral entry into target cells (Bertram et al. 2012, 2013; Gierer et al. 2013; Glowacka et al. 2011; Matsuyama et al. 2010; Millet and Whittaker 2014; Shirato et al. 2013; Simmons et al. 2005). In addition, furin can cleave CoV S proteins in infected cells (Bergeron et al. 2005; Millet and Whittaker 2015; Yamada and Liu 2009). These proteases are expressed at different sites in cells, and their intracellular localization determines the cellular location of S protein-driven membrane fusion. For instance, cathepsin L is expressed in endosomes and cleaves S proteins upon viral uptake into these vesicles (Burkard et al. 2014; Huang et al. 2006; Qiu et al. 2006; Simmons et al. 2005; White and Whittaker 2016), while TTSPs process their ligands at the cell surface and are believed to cleave S proteins at this site (Glowacka et al. 2011; Matsuyama et al. 2010; Shulla et al. 2011). Finally, S protein processing in infected cells can determine which proteases can be engaged for priming during viral entry into target cells, suggesting an intricate connection between proteolysis events (Park et al. 2016).
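The relationship between priming protease and site of action described in this paragraph can be summarized as a small lookup structure. The Python sketch below is only a compact restatement of the text under simplifying assumptions: each protease is assigned a single site (furin is listed for the secretory pathway of infected cells, although the text notes it can also act during entry), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivatingProtease:
    name: str
    protease_class: str
    site_of_action: str


# One simplified entry per protease named in the text above.
PROTEASES = (
    ActivatingProtease("cathepsin L", "cysteine protease", "endosome"),
    ActivatingProtease(
        "TMPRSS2", "type II transmembrane serine protease", "cell surface"
    ),
    ActivatingProtease(
        "furin", "serine protease", "secretory pathway of infected cells"
    ),
)


def proteases_at(site: str) -> list[str]:
    """Names of the listed proteases whose assumed site of action matches."""
    return [p.name for p in PROTEASES if p.site_of_action == site]


endosomal = proteases_at("endosome")
surface = proteases_at("cell surface")
```

Because the site of protease action determines where S protein-driven membrane fusion takes place, such a mapping also indicates the expected location of fusion for a given activating enzyme.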

Link between receptor binding and priming. Receptor binding and priming are frequently viewed as separate events. For instance, the FLUAV hemagglutinin is primed by proteases in infected cells and uses sialic acid-modified proteins or lipids on the surface of target cells as entry receptor (Hamilton et al. 2012). In contrast, receptor engagement and priming can be intimately connected for CoV S proteins. Thus, SARS-S on cell-free virions is inactivated by trypsin cleavage, while trypsin cleavage of virion-associated SARS-S bound to its receptor ACE2 primes the S protein for membrane fusion (Belouzard et al. 2009; Matsuyama et al. 2005; Simmons et al. 2004, 2005). Similarly, DPP4 binding of MERS-S, precleaved at the S1/S2 site, is believed to be required for subsequent priming by TMPRSS2, as discussed above (Millet and Whittaker 2014; Park et al. 2016). On the basis of these findings, it has been postulated that receptor binding can induce conformational changes in S proteins that expose cleavage sites for priming proteases.

Priming and triggering of S proteins: Distinction without a difference? Viral glycoproteins are usually triggered by protonation and/or receptor binding, which allow the proteins to overcome the energy barrier associated with membrane fusion. However, neither binding to receptor nor exposure to low pH is sufficient to trigger the S proteins of MERS-CoV and SARS-CoV (Li et al. 2006; Sha et al. 2006; Simmons et al. 2004). Therefore, it is conceivable that proteolytic processing of these S proteins may suffice for triggering. In order to reflect this finding, we will replace “priming” by “activating” in the remainder of this discussion.

4 Proteolytic Activation of the Spike Proteins of SARS-CoV and MERS-CoV

4.1 Cathepsin L: Endosomal Activator of the Spike Protein

The role of cathepsin L in coronavirus entry was discovered in the context of SARS-CoV infection. Initial studies showed that SARS-S-driven entry is pH-dependent (Hofmann et al. 2004; Huang et al. 2006; Simmons et al. 2004, 2005) but also discovered that exposure to low pH fails to trigger the membrane fusion activity of the S protein (Simmons et al. 2004), arguing that protons might indirectly promote SARS-CoV entry. Simmons and coworkers provided an explanation for this seemingly paradoxical finding: They showed that inhibitors of cathepsin L activity block SARS-S-driven entry into host cells, while recombinant cathepsin L can activate S protein-driven membrane fusion (Simmons et al. 2005), indicating that the pH-dependency of SARS-S-driven entry stems from protons being required for cathepsin L activity rather than from protonation of SARS-S triggering the membrane fusion activity. Moreover, they demonstrated that trypsin treatment of cell-bound viruses allows SARS-S-driven entry into cells pretreated with a cathepsin L inhibitor, while trypsin treatment of cell-free particles abrogated infectivity (Simmons et al. 2005). Thus, activation of the S protein at the cell surface can override the need for endosomal cathepsin L activity for SARS-S-driven entry and is likely promoted by S protein interactions with ACE2. These findings established cathepsin L as a CoV-activating protease and are in keeping with previous reports demonstrating a role of cathepsin L in reovirus uncoating (Ebert et al. 2002) and Ebola virus glycoprotein (EBOV-GP) activation (Chandran et al. 2005).

What is known about cathepsin L expression and physiological functions? The cathepsin family encompasses serine (cathepsins A and G), aspartic (cathepsins D and E), and cysteine proteases (cathepsins B, C, F, H, K, L, O, S, V, X, and W in humans) (Turk et al. 2012). The cysteine cathepsins are localized to lysosomes and cleave a variety of extra- and intracellular substrates, preferentially after basic or hydrophobic residues. The cathepsins B, H, L, C, X, F, O, and V are ubiquitously expressed and seem to be required for protein degradation and turnover in a cell type-independent fashion. In contrast, expression of cathepsins K, W, and S is cell type- or tissue-specific, suggesting more specialized functions (Lecaille et al. 2002). Although a slightly acidic pH is required for activity of cysteine cathepsins and exposure to neutral pH may irreversibly abrogate enzymatic activity (Turk et al. 1995), these enzymes may also be localized and active in compartments other than lysosomes. For instance, cleavage of histones by cathepsin L in the nucleus has been reported and may regulate the cell cycle (Goulet et al. 2004), while activity of cathepsins in the extracellular space may contribute to degradation of the extracellular matrix and the resulting pathologies (Fonovic and Turk 2014; Obermajer et al. 2008). Cysteine cathepsins are generated as preproenzymes: An N-terminal signal peptide facilitates ER import and is removed cotranslationally; the remaining propeptide is required for proper folding of the protein and for transport into endo- and lysosomes in a mannose-6-phosphate receptor (M6PR)-dependent fashion (Hasilik et al. 2009; Saftig and Klumperman 2009; Turk et al. 2012). Moreover, the propeptide blocks the substrate binding site and thereby prevents premature activity of the enzyme. Finally, the propeptide is removed either by autocatalytic cleavage or by other proteases, resulting in the generation of mature, proteolytically active enzymes, which may be present as single-chain or double-chain (attached by a disulfide bond) forms (Turk et al. 2012).

The demonstration that cathepsin L can activate SARS-S, at least in cell lines (Huang et al. 2006; Simmons et al. 2005), raises the question at which site the S protein is processed by this protease. Bosch and colleagues have demonstrated with recombinant proteins that cathepsin L cleaves SARS-S at T678 (Bosch et al. 2008), which represents a region in which furin cleavage occurs in other CoV S proteins. However, it remains to be investigated whether this residue is indeed required for SARS-S activation by cathepsin L during viral entry. The S1/S2 cleavage site, defined by R667 (Belouzard et al. 2009; Follis et al. 2006; Simmons et al. 2011), and the S2′ site, defined by R797 (Belouzard et al. 2009), are required for SARS-S activation by trypsin. However, both sites are dispensable for cathepsin L-dependent, SARS-S-driven host cell entry (Belouzard et al. 2009; Simmons et al. 2011). It remains to be determined whether both sites are indeed not recognized by this protease or whether cathepsin L can activate SARS-S at surrogate sites, in case R667 and R797 are not available. The latter possibility would be in keeping with the low substrate specificity of cathepsin L.

Several CoVs other than SARS-CoV can use cathepsin L for S protein activation, including PEDV (Liu et al. 2016), MHV (Burkard et al. 2014; Qiu et al. 2006), HCoV-229E (Kawase et al. 2009), and MERS-CoV (Gierer et al. 2013; Qian et al. 2013; Shirato et al. 2013; Yang et al. 2015). Although MERS-S activation by cathepsin L has not been observed by all studies (Burkard et al. 2014), these results indicate that inhibitors targeting this protease might display broad anti-CoV activity. A notable exception is HCoV-NL63, which was reported to enter target cells in a pH-dependent but cathepsin L-independent fashion (Huang et al. 2006). Although these results are not undisputed (Hofmann et al. 2006), they suggest that NL63-S might exploit endosomal proteases other than cathepsin L for entry, and cysteine cathepsins with substrate specificity and expression similar to cathepsin L are potential candidates. In sum, cathepsin L can activate diverse CoV upon endosomal entry (Fig. 4.4). The mechanisms controlling choice of cathepsin L versus other CoV-activating proteases as well as their role in CoV spread in vivo have only recently been discovered and will be discussed in the next section.

Fig. 4.4

Activation of coronavirus spike proteins by host cell proteases occurs at different stages in the viral life cycle. Binding of the viral spike (S) protein to a cellular receptor can induce endocytosis of virions. In the endosome, the pH-dependent cysteine protease cathepsin L (CatL) can activate the S protein for fusion with the endosomal membrane. Alternatively, receptor binding may expose a protease cleavage site and may thus promote S protein activation at the plasma membrane by type II transmembrane serine proteases (TTSPs) or furin. Membrane fusion allows the release of the viral genome into the cytoplasm, the site of viral genome replication and protein translation. The S protein is synthesized in the constitutive secretory pathway, where some S proteins can be cleaved by furin or other pro-protein convertases during passage through the trans-Golgi network (TGN). Finally, nascent virions are assembled at the endoplasmic reticulum/Golgi intermediate compartment and are released from infected cells through exocytosis

4.2 Activation of the Spike Protein by Type II Transmembrane Serine Proteases at the Cell Surface

Type II transmembrane serine proteases (TTSPs) were identified as activators of viral infection by Böttcher and coworkers, who showed that the TTSPs TMPRSS2 and HAT cleaved and thereby activated FLUAV-HA, at least upon directed expression in cell lines (Böttcher et al. 2006). Subsequent studies showed that TMPRSS2 can also activate HA upon endogenous expression in cell lines (Bertram et al. 2010b; Böttcher-Friebertshäuser et al. 2011) and provided evidence that this protease is expressed in FLUAV target cells in the human respiratory tract (Bertram et al. 2012), suggesting that TMPRSS2 could promote FLUAV spread in the infected host. Indeed, Hatesuer and colleagues (Hatesuer et al. 2013) as well as subsequent studies (Sakai et al. 2014; Tarnow et al. 2014) demonstrated that mice lacking tmprss2 are largely resistant to spread and pathogenesis of several FLUAV subtypes and could link this finding to absence of HA activation. Moreover, polymorphisms in the TMPRSS2 gene in humans which increase TMPRSS2 expression were shown to be associated with severe influenza, suggesting that this protease might also promote FLUAV spread in humans (Cheng et al. 2015). Finally, it is noteworthy that several other TTSPs can activate HA upon directed expression in cell culture, including TMPRSS4, DESC1, MSPL, and matriptase (Baron et al. 2013; Beaulieu et al. 2013; Bertram et al. 2010b; Chaipan et al. 2009; Hamilton et al. 2012; Zmora et al. 2014), and that TMPRSS4 has recently been shown to promote spread of an H3N2 FLUAV in mice that showed partial TMPRSS2-independence (Kuhn et al. 2016).

TTSPs are membrane-anchored serine proteases that play an important role in several physiological processes, including maintenance of homeostasis (Antalis et al. 2010, 2011; Szabo and Bugge 2011). They exhibit a characteristic domain organization: An N-terminal cytoplasmic tail is followed by a transmembrane domain, a stem region, and a C-terminal protease domain (Bugge et al. 2009; Hooper et al. 2001). The cytoplasmic tail might be involved in targeting the protease to the cellular membrane, while the transmembrane domain anchors the protein in the plasma membrane (Bugge et al. 2009). The stem region has a strictly modular organization and may be composed of up to 11 different protein domains (Antalis et al. 2011; Hooper et al. 2001). The number and configuration of these domains are characteristic for specific TTSPs, and the stem region is known to function in protein-protein or protein-ligand interactions (Hooper et al. 2001). Finally, the protease domain harbors a conserved catalytic triad of histidine, aspartate, and serine, which is essential for enzymatic activity. TTSPs are synthesized as inactive pro-proteins (zymogens) and are either autoactivated or activated by another protease. Activation requires cleavage at a site located at the interface between stem region and protease domain and may result in shedding of the enzymatically active protease domain into the extracellular space (Antalis et al. 2010; Bugge et al. 2009; Hooper et al. 2001).

Three studies independently demonstrated that TMPRSS2 not only activates FLUAV-HA but also cleaves and activates SARS-S (Glowacka et al. 2011; Matsuyama et al. 2010; Shulla et al. 2011). They found that directed expression of TMPRSS2 in target cells allowed SARS-S-driven entry, despite previous treatment of cells with lysosomotropic agents (i.e., elevated endosomal pH) or cathepsin L inhibitors. These results indicate that TMPRSS2 activates virion-associated SARS-S early during viral entry and thereby renders entry independent of cathepsin L activity (Fig. 4.4). Activation is believed to occur at the plasma membrane, likely after S protein binding to ACE2. Notably, ACE2 and TMPRSS2 interact (Shulla et al. 2011), and it is conceivable that conformational changes in SARS-S that are induced upon SARS-S binding to ACE2 might expose the TMPRSS2 cleavage site in the S protein. SARS-S activation by TMPRSS2 was only observed when S protein and protease were located in different membranes (i.e., viral and cellular membranes, respectively) and thus depends on SARS-S cleavage in trans (Glowacka et al. 2011; Matsuyama et al. 2010). However, TMPRSS2 can also cleave SARS-S when both proteins are localized in the same membrane (cis-cleavage), at least upon directed expression, and this may result in shedding of soluble SARS-S into the extracellular space, where the S protein can serve as a decoy for neutralizing antibodies (Glowacka et al. 2011).

The cleavage site of TMPRSS2 in SARS-S is largely unclear, although one can speculate that R797 might be involved. One report suggested that R667 might be dispensable for SARS-S cleavage by TMPRSS2, but a quantitative analysis was not provided (Bertram et al. 2011). In contrast, R667 was found to be essential for SARS-S processing by HAT (Bertram et al. 2011), although it was not determined if this residue is also required for SARS-S activation by this protease. In this context, differences in SARS-S activation by TMPRSS2 and HAT should be noted: TMPRSS2 can activate SARS-S for cell-cell and virus-cell fusion in trans (Bertram et al. 2011; Matsuyama et al. 2010). In contrast, HAT can only activate SARS-S for cell-cell but not virus-cell fusion, and activation is observed in both the cis and the trans setting (Bertram et al. 2011). Whether these observations reflect general differences in the activation reaction or simply mirror the somewhat less efficient expression of HAT as compared to TMPRSS2 in transfected cells remains to be investigated. Finally, it is noteworthy that besides TMPRSS2 and HAT, several other TTSPs can cleave and activate SARS-S. Thus, DESC1 and MSPL can cleave and activate SARS-S in trans (Zmora et al. 2014), at least upon directed expression, and thus seem to function in a TMPRSS2-like fashion. Cleavage of SARS-S by recombinant TMPRSS11A has also been demonstrated, and both R667 and R797 were identified as cleavage sites (Kam et al. 2009). Moreover, exposure of SARS-S-bearing particles to recombinant TMPRSS11A augmented entry into cultured respiratory epithelium, and mutation of R797 reduced entry into these cells to background levels. In contrast, mutation of R667 had only a modest effect (Kam et al. 2009).

Finally, it should be highlighted that the exploitation of TTSPs for S protein activation is not limited to SARS-CoV: TMPRSS2 expression facilitates cathepsin L-independent 229E-S- and MERS-S-driven entry into target cells (Bertram et al. 2013; Gierer et al. 2013; Kawase et al. 2012; Shirato et al. 2013, 2017). Moreover, the acquired ability to use human proteases, including TMPRSS2, for S protein activation has been suggested to be a determinant of zoonotic transmission of MERS-CoV-related viruses from bats to humans (Yang et al. 2014, 2015). TMPRSS2 has also been reported to play a role in PEDV infection. In this context, protease activity seems to be required for efficient release of progeny virions from infected cells (Shirato et al. 2011). The underlying mechanism is unknown, but one can speculate that either modulation of S protein glycosylation (Bertram et al. 2010a) or, more plausibly, TMPRSS2-mediated inactivation of an antiviral host cell factor might be responsible.

The findings discussed above indicate that S protein activation by TTSPs is a complex process and is governed, among other factors, by the localization of S protein and protease in the same or in opposing membranes. Another layer of complexity is added by the observation that TMPRSS2 can not only cleave the SARS-S protein but can also process its entry receptor ACE2 (Heurich et al. 2014). Thus, TMPRSS2 and the metalloprotease a disintegrin and metalloproteinase domain 17 (ADAM17) cleave ACE2 close to its transmembrane domain, and cleavage may result in ACE2 shedding (Haga et al. 2008; Heurich et al. 2014). Moreover, it has been proposed that ACE2 cleavage by ADAM17 is required for efficient SARS-S-driven entry (Haga et al. 2008), while ACE2 processing by TMPRSS2 seems to account for the augmentation of viral infectivity observed upon directed expression of TMPRSS2 in target cells (Heurich et al. 2014). These findings suggest that TTSPs and other proteases can impact S protein-driven entry in ways other than S protein activation, but the underlying mechanism remains to be investigated.

What is the evidence that TMPRSS2 and potentially other TTSPs promote coronavirus spread in the infected host? For HAT, DESC1, MSPL, and TMPRSS11A, the evidence is limited to the demonstration of mRNA and/or protein expression in the lung and to S protein activation upon directed expression of protease or addition of recombinant protease (Bertram et al. 2012; Kam et al. 2009; Zmora et al. 2014). In contrast, a steadily growing body of evidence suggests an important contribution of TMPRSS2 to SARS-CoV spread in the host: TMPRSS2 is coexpressed with ACE2 in the human lung (Bertram et al. 2012), and TMPRSS2-positive cells were found to harbor SARS-CoV antigen in experimentally infected cynomolgus macaques (Matsuyama et al. 2010). Moreover, blockade of TMPRSS2 activity reduced SARS-CoV infection of respiratory epithelium (Kawase et al. 2012). Finally, a serine protease inhibitor active against TMPRSS2 reduced viral spread and pathogenesis in a rodent model of SARS-CoV infection, while blockade of cathepsin L activity had no appreciable effect (Zhou et al. 2015). These findings suggest that SARS-CoV, like FLUAV, might depend on TMPRSS2 for spread within hosts. Such a scenario would be in keeping with the findings that cathepsin L is not expressed in respiratory epithelium at levels sufficient for MERS-S activation (Park et al. 2016) and that HCoV-229E isolated from patients uses TMPRSS2 while viral variants adapted to growth in cell culture employ cathepsin L (Kawase et al. 2009; Shirato et al. 2017), suggesting that S protein activation by cathepsin L might be the result of cell culture adaptation. In sum, several lines of evidence suggest that TMPRSS2 but not cathepsin L activity is important for CoV spread in the infected host. This raises the question of which determinants control whether cathepsin L or TMPRSS2 is used for S protein activation. Recent insights obtained for MERS-S activation provide interesting answers and will be discussed below.

4.3 Furin Can Activate Coronavirus Spike Proteins in the Constitutive Secretory Pathway of Infected Cells and During Viral Entry into Target Cells

Furin, a subtilisin-like serine protease, belongs to the family of pro-protein convertases (PPCs), which comprises nine members (Seidah and Prat 2012). Seven of these enzymes process substrates at basic residues and are required for activation of various cellular proteins, including hormones, growth factors, and adhesion molecules. Cleavage occurs at single or paired basic residues, which fit the following rule: (R/K)Xn(R/K)↓ (Nakayama 1997; Seidah et al. 2013; Seidah and Prat 2012), with the arrow indicating the cleavage site, X indicating any amino acid, and n being 0, 2, 4, or 6. Furin is ubiquitously expressed and found in the trans-Golgi network (TGN), from where it can be transported to the cell surface and back again via the endosomal compartment (Bosshart et al. 1994; Molloy et al. 1994). Two PPCs, SKI-1 and PCSK9, play a role in cholesterol/lipid homeostasis and cleave substrates at nonbasic residues (Seidah et al. 2013; Seidah and Prat 2012). Like cathepsin L and TTSPs, PPCs are synthesized as zymogens, and the presence of a prosegment, which is removed by autocatalytic activation but remains non-covalently associated with the protease, prevents premature activity.
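
The cleavage rule (R/K)Xn(R/K)↓ with n = 0, 2, 4, or 6 can be illustrated by a short sequence scan. The following Python sketch is an illustration only and is not taken from the cited studies: the function name `find_ppc_motifs` and the toy peptides are hypothetical, and a motif match merely indicates a candidate site, since actual processing also depends on sequence context and accessibility.

```python
# Minimal sketch: scan a protein sequence for the basic-residue rule
# (R/K)-Xn-(R/K)| with n in {0, 2, 4, 6}. Names are hypothetical.

SPACERS = (0, 2, 4, 6)      # allowed numbers of intervening residues (Xn)
BASIC = frozenset("RK")     # arginine or lysine


def find_ppc_motifs(seq):
    """Return a list of (start, n, cleave_after) tuples, 1-based.

    start        position of the N-terminal basic residue
    n            number of intervening residues
    cleave_after position of the C-terminal basic residue; the scissile
                 bond would follow this residue
    """
    seq = seq.upper()
    hits = []
    for i, residue in enumerate(seq):
        if residue not in BASIC:
            continue
        for n in SPACERS:
            j = i + n + 1   # index of the C-terminal basic residue
            if j < len(seq) and seq[j] in BASIC:
                hits.append((i + 1, n, j + 1))
    return hits


if __name__ == "__main__":
    # Toy peptides, not real spike sequences.
    for peptide in ("AARRAA", "GRSVRG", "AAAA"):
        print(peptide, find_ppc_motifs(peptide))
```

Reporting every (start, n) combination, rather than only the first match per position, keeps overlapping candidate sites visible, which matters in stretches with several basic residues.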

Many CoV S proteins harbor a furin motif at the S1/S2 site, and processing of, for example, the S protein of MHV, strain A59 (de Haan et al. 2004), and IBV (Yamada and Liu 2009) by furin has been demonstrated. Moreover, the insertion of a furin motif in the S protein of PEDV allows for trypsin-independent viral spread in cell culture (Li et al. 2015). The contribution of furin to SARS-S activation is less clear. It has been documented that a pro-protein convertase inhibitor blocks SARS-CoV spread in cell culture, but mutational analysis failed to demonstrate robust processing of the S protein by this protease (Bergeron et al. 2005). Moreover, insertion of a furin motif at the S1/S2 site augmented SARS-S-driven cell-cell but not virus-cell fusion (Follis et al. 2006). In contrast, a prominent role of furin in MERS-S activation has been documented. Thus, it has been shown that furin can cleave MERS-S at the S1/S2 site during S protein biogenesis in the constitutive secretory pathway of infected cells and at the S2′ site during S protein-driven entry into target cells (Millet and Whittaker 2014). Blockade of furin expression or activity reduced MERS-S-driven entry (Burkard et al. 2014; Millet and Whittaker 2014), indicating that furin is an activator of MERS-S (Fig. 4.4). However, a requirement of furin activity for MERS-S-driven entry was not observed in a separate study (Gierer et al. 2015) and thus might be cell type-specific to some extent (Millet and Whittaker 2014).

The observation that furin is an activator of certain CoV S proteins raises the question of which determinants control whether S proteins are activated by furin, cathepsin L, or TMPRSS2. An intriguing answer has been provided by a recent study by Park and colleagues. They showed that cleavage of MERS-S at the S1/S2 site in infected cells determines whether MERS-S is activated by cathepsin L or TMPRSS2 during viral entry into target cells (Park et al. 2016). Thus, only precleaved MERS-S seems to be able to undergo the conformational changes upon receptor binding that are required for activation by TMPRSS2. If the S protein is uncleaved and thus conformationally rigid, binding to receptor results in viral uptake into endosomes, where the S protein is activated by cathepsin L (Park et al. 2016). However, this activation pathway seems to be less robust as compared to activation by TMPRSS2 and does not allow efficient entry into cells within respiratory epithelium, due to expression of insufficient amounts of cathepsin L (Park et al. 2016). In sum, cleavage at the S1/S2 site in infected cells can determine protease choice during entry into target cells, with only cleaved S proteins exhibiting sufficient conformational flexibility for activation at the cell surface by TMPRSS2.

Conclusions

The cleavage activation of the S protein of coronaviruses by host cell proteases is required for viral infectivity, and the responsible enzymes constitute potential targets for antiviral intervention. Studies in recent years have provided interesting insights regarding the nature of the S protein-activating proteases, the mechanisms that control protease choice, and the contribution of specific enzymes to viral spread in the infected host (Fig. 4.4). The pH-dependent cysteine protease cathepsin L can activate the S proteins of SARS-CoV, MERS-CoV, PEDV, and other pathogenic coronaviruses upon viral uptake into endosomes. However, cathepsin L might not be sufficiently expressed in respiratory epithelium to support viral spread in this important target tissue, and, at least for some CoVs, efficient S protein activation by cathepsin L might be the result of viral passaging in cell lines. Such a scenario would be compatible with the finding that EBOV-GP is activated by cathepsin B and L for entry into cell lines (Chandran et al. 2005), while expression of these proteases is dispensable for efficient viral spread in mice (Marzi et al. 2012). The type II transmembrane serine protease TMPRSS2 activates FLUAV-HA and is essential for spread of diverse FLUAV in rodent and likely also human hosts. Similarly, TMPRSS2 activates the S proteins of SARS-CoV and MERS-CoV and is expressed in cells in the human respiratory epithelium that also express the SARS-CoV receptor, ACE2. Moreover, TMPRSS2 activity is required for efficient SARS-S- and MERS-S-driven entry into cultured respiratory epithelium, and a protease inhibitor active against TMPRSS2 suppresses SARS-CoV spread and pathogenesis in a rodent model. Finally, pre-cleavage of MERS-S by furin in infected cells is essential for subsequent S protein activation by TMPRSS2 for entry into target cells, potentially by providing the S protein with increased conformational flexibility.

Thus, TMPRSS2 is an attractive antiviral target, and specific inhibitors of this enzyme might exert activity against a broad spectrum of respiratory viruses. Initial efforts to generate such inhibitors have been documented (Meyer et al. 2013), and compounds with high specificity for TMPRSS2 can be expected to suppress viral spread without inducing unwanted side effects, since tmprss2 is dispensable for normal development and homeostasis in mice (Kim et al. 2006).