1 Introduction

Coronaviruses and their closest relatives from the order Nidovirales have exceptionally large RNA genomes of about 30 kb and synthesize an extensive set of 5′-leader-containing, subgenome-length RNAs. The synthesis of these RNAs is mediated by the viral replicase–transcriptase complex (RTC), a large multisubunit complex that is comprised of more than a dozen proteins encoded by the viral replicase gene and other proteins. The RTC includes the key replicative proteins of the virus, such as RNA-dependent RNA polymerase (RdRp) and helicase activities, as well as enzymes that are thought to be involved in the processing and modification of viral and/or cellular RNAs, such as primase, endoribonuclease, exoribonuclease and ribose-2′-O-methyltransferase activities (for recent reviews, see Masters 2006; Ziebuhr 2005, 2008). The RTC is anchored through three replicase-gene encoded integral membrane proteins to intracellular membranes derived from the endoplasmic reticulum (Knoops et al. 2008; Masters 2006; Ziebuhr and Snijder 2007). Besides proteins encoded by the replicase gene, the RTC contains the nucleocapsid (N) protein (Almazan et al. 2004; Schelle et al. 2005) and several cellular factors, which have not been characterized in great detail (Shi and Lai 2005; van Hemert et al. 2008). It is becoming increasingly clear that the coronavirus replicase gene also encodes proteins that are not essential for viral RNA synthesis but are involved in viral pathogenicity and specific virus–host interactions (see below).

The single-stranded, positive-sense RNA genome of coronaviruses fulfills a double role, serving as an mRNA for the expression of the replicase gene and as a template for the synthesis of genome- and subgenome-length minus-strand RNAs. A large part of the genome (about two-thirds) is occupied by the replicase gene, which consists of two (slightly) overlapping open reading frames (ORFs) located in the 5′-proximal region of the viral genome RNA. The remaining 3′-terminal third of the genome encodes the four major structural proteins of the virus (S, E, M, and N) and a varying number of so-called accessory proteins (Fig. 6.1). Compared to other coronaviruses, SARS coronavirus (SARS-CoV) encodes an unusually large number of accessory proteins, some of which are involved in counteracting antiviral host responses (Chap. 4). The ORFs located downstream of the SARS-CoV replicase gene are expressed from eight subgenomic (sg) RNAs (Thiel et al. 2003) (Fig. 6.1). A peculiarity of the sgRNAs produced by coronaviruses and some other nidoviruses is that both the plus-strand genome RNA and all plus-strand sgRNAs share a short, so-called leader sequence at their 5′ ends, whose functional relevance is currently unknown (Spaan et al. 1983). It has been speculated that the minus-strand complement of the leader sequence (present at the 3′ ends of all minus-strand RNAs) might facilitate initiation of plus-strand RNA synthesis. Alternatively, the leader might promote the translation of sgRNAs, for example through the presence of sequence elements required for efficient recognition by the viral capping apparatus or other mechanisms enhancing translation initiation.
The precise mechanism used to join the “leader” and “body” sequences of sgRNAs (which are located more than 20 kb apart in the genome) is unknown but is generally accepted to involve a discontinuous step during sg minus-strand RNA synthesis and be guided by complementary base-pairing between transcription-regulating sequences (for recent reviews, see Pasternak et al. 2006; Sawicki et al. 2007). The specific requirements for maintaining genome RNAs of an unparalleled size and synthesizing sg minus-strand RNAs in a discontinuous manner are reflected by an exceptional complexity of the protein machinery involved in viral RNA synthesis, which will be the topic of this review. With few exceptions, the proteins and mechanisms involved in genome replication and transcription are conserved among SARS-CoV and other coronaviruses. In this chapter, we will review recent work on the SARS-CoV RTC and refer to related work on other coronaviruses as appropriate.

Fig. 6.1

SARS coronavirus (SARS-CoV) genome organization and expression. (a) Open reading frames in the SARS-CoV genome. The replicase gene comprises two large open reading frames (ORFs), which are shown in black. (b) RNAs produced in SARS-CoV-infected cells. The 5′-terminal ORF(s) expressed from specific RNAs is/are shown as boxes. The small black box indicates the 5′-leader sequence present on each of the viral RNAs. The replicase gene (ORFs 1a and 1b) is shown in black. To the right, SARS-CoV genomic and subgenomic RNAs as detected by Northern blotting using a 3′ end-specific probe are shown together with information on sizes (in kilobases) and names of proteins expressed from these RNAs. (c) Overview of the domain organization and proteolytic processing of the SARS-CoV replicase polyproteins, pp1a (486 kDa) and pp1ab (790 kDa). For comparison, the mouse hepatitis virus (MHV)-A59 replicase polyproteins are shown. The processing end-products of pp1a are designated nonstructural proteins (nsps) 1 to 11 and those of pp1ab are designated nsp1 to nsp10 and nsp12 to nsp16. Cleavage sites that are processed by the viral main protease, Mpro, are indicated by grey arrowheads; sites that are processed by the papain-like proteases 1 and 2 (PL1 and PL2) are indicated by white and black arrowheads, respectively. Ac, acidic domain; A, ADRP (ADP-ribose-1″-phosphatase); SUD, SARS-CoV unique domain; NAB, nucleic acid-binding domain; Rp, noncanonical RNA polymerase (putative primase); RdRp, RNA-dependent RNA polymerase; TM1, TM2, TM3, transmembrane domains 1, 2, and 3, respectively; HEL, helicase; ExoN, 3′-to-5′ exoribonuclease; NeU, NendoU (nidoviral uridylate-specific endoribonuclease); MT, ribose-2′-O-methyltransferase

2 Organization and Expression of the Coronavirus Replicase Gene

The SARS-CoV replicase gene contains more than 21,000 nucleotides and is comprised of two ORFs called ORF1a and ORF1b (Fig. 6.1) (Marra et al. 2003; Rota et al. 2003; Thiel et al. 2003). Upon infection of the cell, ORFs 1a and 1b are translated into two large polyproteins, pp1a and pp1ab, of approximately 490 and 790 kDa, respectively. Polyprotein 1ab is a C-terminally extended version of pp1a; pp1ab expression requires a programmed (–1) ribosomal frameshift prior to termination of ORF1a translation, resulting in continuation of translation in the ORF1b reading frame (Thiel et al. 2003). Frameshifting depends on two cis-acting RNA elements, a “slippery” heptanucleotide sequence and an RNA pseudoknot structure (see Chap. 6 for details). Based on in vitro translation experiments, it has been estimated that frameshifting occurs in roughly 30% of translation events (Brierley 1995; Dos Ramos et al. 2004; Herold and Siddell 1993; Namy et al. 2006). This results in a considerably higher amount of ORF1a-encoded proteins compared to ORF1b-encoded proteins, which is likely to be of biological relevance as the frameshifting mechanism is conserved in coronaviruses and other nidoviruses.
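The stoichiometric consequence of frameshifting can be illustrated with a short calculation. The sketch below is only a back-of-the-envelope model: it assumes a single frameshift efficiency (the in vitro estimate of roughly 30% quoted above, which may not hold in infected cells), assumes that every ribosome initiating on ORF1a yields either pp1a or pp1ab, and ignores differences in protein stability. The function name is illustrative and not taken from any published tool.

```python
def orf1a_to_orf1b_ratio(frameshift_efficiency):
    """Molar ratio of ORF1a-encoded to ORF1b-encoded products.

    Simplifying assumption: each ribosome that initiates on ORF1a makes
    either pp1a (no frameshift) or pp1ab (frameshift); protein turnover
    and processing efficiency are ignored.
    """
    if not 0.0 < frameshift_efficiency <= 1.0:
        raise ValueError("frameshift efficiency must be in (0, 1]")
    # ORF1a-encoded domains are present in both pp1a and pp1ab, i.e. in the
    # product of every ribosome; ORF1b-encoded domains are present only in
    # pp1ab, i.e. in the frameshifting fraction.
    return 1.0 / frameshift_efficiency


if __name__ == "__main__":
    f = 0.30  # in vitro estimate quoted in the text (an assumption here)
    print(f"pp1a : pp1ab = {1 - f:.2f} : {f:.2f}")
    print(f"ORF1a : ORF1b products = {orf1a_to_orf1b_ratio(f):.1f} : 1")
```

Under these assumptions, a 30% frameshift efficiency corresponds to roughly a threefold molar excess of ORF1a-encoded over ORF1b-encoded proteins.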

SARS-CoV pp1a and pp1ab are co- and post-translationally processed by two proteases, a papain-like protease (PLpro) and the main protease (Mpro, nsp5), resulting in 16 mature products called nonstructural proteins (nsps) 1–16 (Harcourt et al. 2004; Prentice et al. 2004; Snijder et al. 2003; Thiel et al. 2003). The PLpro domain is part of nsp3 and processes the nsp1|2, nsp2|3 and nsp3|4 cleavage sites. The Mpro cleaves the C-terminal part of pp1a/1ab at the remaining 11 sites (Fig. 6.1), releasing the most conserved proteins of the coronavirus RTC. The individual cleavage sites are thought to be processed with varying efficiencies, resulting in the presence of some reasonably stable intermediates. This has been studied in some detail for the proteins released by the PLpro activity (Harcourt et al. 2004). While the nsp1|2 and nsp3|4 sites are rapidly processed, the nsp2|3 site was found to be cleaved at a lower rate, giving rise to a relatively long-lived nsp2–3 intermediate in infected cells. The rapid cleavage of the nsp3|4 site results in fast separation of the domains processed by PLpro and those processed by Mpro. Even though an initial study of the nonstructural proteins in SARS-CoV-infected cells failed to detect processing intermediates (Prentice et al. 2004), the cleavage of interdomain junctions by Mpro is also thought to occur at varying rates (Fan et al. 2004). This is supported by data obtained for mouse hepatitis virus (MHV), showing that nsp4–10 is a relatively stable processing intermediate (Schiller et al. 1998). In addition to the sequence of the individual cleavage sites, the efficiency of processing appears to be influenced by the secondary structure of the substrate (Fan et al. 2005, 2004). Thus, cleavage of peptides with a high propensity to form β-sheets is enhanced, consistent with the substrate binding mode observed in crystal structures of Mpro in complex with inhibitors (Anand et al. 2003; Yang et al. 2003, 2005).
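The bookkeeping of cleavage sites and mature products described above can be checked with a few lines of code. The sketch below simply encodes the assignments given in the text (three PLpro sites, the remaining sites processed by Mpro); the junction labels and function names are illustrative conventions, and the nsp10|11 and nsp10|12 junctions are counted as a single site because nsp11 and nsp12 share their N-terminal residues.

```python
# Junctions processed by the papain-like protease, as stated in the text.
PLPRO_SITES = ["nsp1|2", "nsp2|3", "nsp3|4"]


def mpro_sites():
    """Junctions processed by Mpro, using 'nspA|B' labels (illustrative)."""
    sites = [f"nsp{i}|{i + 1}" for i in range(4, 10)]   # nsp4|5 ... nsp9|10
    # One physical site: yields nsp11 in pp1a and nsp12 in pp1ab.
    sites.append("nsp10|11(12)")
    sites += [f"nsp{i}|{i + 1}" for i in range(12, 16)]  # nsp12|13 ... nsp15|16
    return sites


def mature_products():
    """Union of the processing end-products of pp1a and pp1ab."""
    pp1a = {f"nsp{i}" for i in range(1, 12)}               # nsp1 to nsp11
    pp1ab = {f"nsp{i}" for i in range(1, 11)} | {
        f"nsp{i}" for i in range(12, 17)                   # nsp12 to nsp16
    }
    return pp1a | pp1ab


if __name__ == "__main__":
    print(len(PLPRO_SITES), "PLpro sites")
    print(len(mpro_sites()), "Mpro sites")
    print(len(mature_products()), "mature nsps")
```

Counting in this way reproduces the numbers given in the text: 3 PLpro sites, 11 Mpro sites and 16 mature nonstructural proteins.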

It is generally accepted that processing intermediates fulfil important functions that, in some cases, differ from those of the fully processed proteins. This strategy is commonly employed by plus-strand RNA viruses to expand the functional repertoire of the relatively small number of protein domains present in viral polyproteins (Dougherty and Semler 1993; Palmenberg 1990). The idea of different functions being associated with precursors and final processing products, respectively, is supported by the fact that coronavirus RNA synthesis requires ongoing protein synthesis, with minus-strand synthesis declining more rapidly than positive-strand synthesis after inhibition of translation by cycloheximide (Sawicki and Sawicki 1986).

3 Functions and Activities of Replicase Gene-Encoded Nonstructural Proteins

3.1 ORF1a-Encoded Nonstructural Proteins 1–11

The proteins processed from the N-terminal regions of the coronavirus replicative polyproteins 1a/1ab are highly divergent among coronaviruses. nsp1 proteins from group 1 and group 2 coronaviruses, respectively, are not evidently related to each other, whereas group 3 coronaviruses lack an nsp1 counterpart altogether, making nsp2 the most N-terminal processing product in this case. The significant sequence diversity is also evident when nsp1 proteins from the same group (1 or 2) are compared with each other (Almeida et al. 2007; Connor and Roper 2007; Snijder et al. 2003). A nuclear magnetic resonance structure was reported for SARS-CoV nsp1 (Almeida et al. 2007). The protein has a complex β-barrel fold flanked by disordered N- and C-terminal domains (see Chap. 18 for details) and lacks statistically significant structural similarity to other cellular or viral proteins. Although the protein is not required for replication of SARS-CoV and MHV in cell culture (Brockway and Denison 2005; Wathelet et al. 2007; Züst et al. 2007), there is increasing evidence to suggest that the protein has important functions in vivo. SARS-CoV nsp1 was suggested to counteract cellular innate immune responses by inhibiting IFN signaling pathways, and a recombinant SARS-CoV carrying an attenuating mutation in nsp1 showed a reduced ability to replicate in cells with an intact IFN response (Wathelet et al. 2007). Narayanan and co-workers (Narayanan et al. 2008) reported that another SARS-CoV nsp1 mutant, but not the wild-type virus, induced high levels of IFN-β, suggesting that nsp1 may also have a role in suppressing type I IFN induction, which contradicts findings reported earlier by others (Wathelet et al. 2007; Züst et al. 2007). Characterization of an MHV mutant expressing a C-terminally truncated form of nsp1 showed that the protein is a major pathogenicity factor of this virus (Züst et al. 2007). There is evidence to suggest that MHV nsp1 is involved in counteracting type I IFN signaling and/or the antiviral activities of IFN-induced proteins, whereas a role in suppression of IFN induction seems less likely.

Furthermore, SARS-CoV nsp1 was shown to (1) promote host mRNA degradation, (2) inhibit cellular translation (Kamitani et al. 2006; Narayanan et al. 2008) and (3) affect cell cycling (Wathelet et al. 2007). MHV nsp1 was reported to colocalize (and interact) with other subunits of the viral replicase–transcriptase at the site of viral RNA synthesis but appears to migrate to virion assembly sites at a later stage of infection (Brockway et al. 2004). It remains to be seen if this relocalization is linked to distinct functions of nsp1 at different stages of the replication cycle of MHV and, possibly, other coronaviruses.

For nsp2, no specific function has yet been identified. It has been suggested that nsp2 is a component of SARS-CoV virions (Neuman et al. 2008). The protein is not essential for replication in vitro as deletion of the entire nsp2 coding sequence was tolerated in both SARS-CoV and MHV (Graham et al. 2005). nsp2 deletion mutants grew to slightly reduced titers and RNA synthesis was diminished by about 50%, with all RNA species being equally affected. More recently, it was shown that the replication defects observed in MHV nsp2 deletion mutants could not be compensated by nsp2 expressed from either a subgenomic RNA or from a C-proximal position in the replicase polyprotein, when inserted between nsp13 and nsp14 (Gadlage et al. 2008).

nsp3 is a large, membrane-anchored multidomain protein. Despite significant sequence diversity among coronavirus nsp3 proteins, up to 16 putative functional domains, including ubiquitin-like, metal-binding, nucleic acid-binding, RNA chaperone-like, poly(ADP)-ribose-binding, protease, transmembrane (TM) and other conserved domains have been identified in nsp3 (Neuman et al. 2008; Ziebuhr et al. 2001). nsp3 has been proposed to be a major “hub” for protein–protein and protein–RNA interactions between viral and (possibly) cellular macromolecules (Imbert et al. 2008; Neuman et al. 2008). Many of the domains are common to all coronaviruses while some are group- or species-specific (Neuman et al. 2008; Snijder et al. 2003; Ziebuhr et al. 2001).

Among the conserved domains are papain-like proteases (PLpros) that cleave the N-terminal part of the polyproteins at up to three sites (Fig. 6.1). While most coronaviruses possess two PLpros (PL1pro and PL2pro), SARS-CoV and IBV each encode only one active PLpro, which is an orthologue of the PL2pro of other coronaviruses (Snijder et al. 2003; Thiel et al. 2003; Ziebuhr et al. 2001). The SARS-CoV PLpro employs a Cys–His–Asp catalytic triad and exhibits a narrow substrate specificity, with all three processing sites conforming to the consensus sequence LXGG (Harcourt et al. 2004; Thiel et al. 2003). Besides its important role in pp1a/pp1ab processing, the SARS-CoV PLpro has deubiquitinating activity (Barretto et al. 2005; Lindner et al. 2005) and shares structural features with cellular ubiquitin-specific proteases (see Chap. 18 for details) (Ratia et al. 2006; Sulea et al. 2005). The PL2pro homolog of HCoV-NL63, a group 1b coronavirus, also has ubiquitin-specific protease activity (Chen et al. 2007b), suggesting that this activity may be of general biological relevance to the coronavirus life cycle. SARS-CoV PLpro removes ubiquitin and ubiquitin-like modifiers (Ubl) from fusion proteins and debranches polyubiquitin chains, with a marked preference for Lys-48- over Lys-63-conjugated chains (Lindner et al. 2007). The C-terminal residues of ubiquitin and the ubiquitin-like modifier ISG15, LRLRGG, match the consensus sequence of SARS-CoV PLpro cleavage sites in pp1a/pp1ab very well (Harcourt et al. 2004; Thiel et al. 2003). SARS-CoV PLpro is able to distinguish between ISG15 and ubiquitin and preferentially cleaves ISG15-modified proteins. The exact mode of recognition of Ubl modifiers by PLpro is not known but might involve interactions between the Zn2+ ribbon domain connecting the α and β subdomains of the PLpro and the β-grasp fold of ubiquitin as well as additional, currently unknown, interactions. The deubiquitinating and deISGylating activities of PLpro have been speculated to be involved in (1) protecting viral and/or cellular proteins from degradation and (2) counteracting innate immune responses (Lindner et al. 2007; Sulea et al. 2005). Recently, PLpro was shown to inhibit IFN signaling by binding to IRF-3 and interfering with its hyperphosphorylation, dimerization and nuclear translocation (Devaraj et al. 2007). Surprisingly, inhibition of the IFN response was not abrogated by substitution of PLpro active-site residues, suggesting that the protease/deubiquitinating activity of PLpro was not involved. Slightly contrasting data were reported by Zheng and co-workers (Zheng et al. 2008). In this case, the reduction of the type I IFN response induced by vesicular stomatitis virus was dependent on the proteolytic activity of the SARS-CoV PLpro.
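The match between the LXGG consensus and the C-terminal LRLRGG sequence of ubiquitin and ISG15 can be made explicit with a simple motif scan. The following is a minimal sketch, assuming that X stands for any of the 20 standard amino acids and that sequences are given in one-letter code; the function name and the second test sequence are hypothetical and serve illustration only. A motif match alone does not, of course, predict whether a site is actually cleaved.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter code

# Lookahead so that overlapping candidate motifs are all reported.
LXGG = re.compile(rf"(?=(L[{AMINO_ACIDS}]GG))")


def find_lxgg_sites(protein):
    """Return (0-based start index, matched motif) for every LXGG match."""
    return [(m.start(), m.group(1)) for m in LXGG.finditer(protein.upper())]


if __name__ == "__main__":
    # C-terminal residues of ubiquitin and ISG15 as quoted in the text.
    print(find_lxgg_sites("LRLRGG"))
    # Made-up sequence containing two candidate motifs.
    print(find_lxgg_sites("MKLAGGSTLRLRGGAV"))
```

For LRLRGG, the scan reports a single match, LRGG, formed by the last four residues.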

Another nsp3 domain is the ADP-ribose-1″-phosphatase (ADRP) domain (also called X domain or macro domain) (Fig. 6.1). The domain is conserved across members of the genera Coronavirus, Torovirus and Bafinivirus (Draker et al. 2006; Schütze et al. 2006) and several other plus-strand RNA viruses (Gorbalenya et al. 1991). Viral ADRPs are related to a large family of macro domain proteins found in many cellular organisms. The protein family is named after the nonhistone domain of macroH2A histones. Macro domain proteins share a conserved fold and bind to (and, in some cases, process) a range of substrates related to ADP-ribose (Karras et al. 2005). Several members of the macro domain family have been shown to (1) bind poly(ADP)-ribose and/or poly(A) RNA and (2) hydrolyze ADP-ribose-1″-phosphate (Egloff et al. 2006; Karras et al. 2005; Neuvonen and Ahola 2009; Putics et al. 2005, 2006; Saikatendu et al. 2005; Xu et al. 2009). The ADRP domains of HCoV-229E, SARS-CoV and transmissible gastroenteritis virus (TGEV) have been shown to hydrolyze ADP-ribose-1″-phosphate to yield ADP-ribose and inorganic phosphate (Egloff et al. 2006; Karras et al. 2005; Neuvonen and Ahola 2009; Putics et al. 2005, 2006; Saikatendu et al. 2005; Xu et al. 2009). The molecular mechanisms of the dephosphorylation reaction have not been elucidated but are thought to differ from those of other phosphatases as there is no sequence and/or structural similarity between the respective enzymes (Allen et al. 2003; Egloff et al. 2006). Based on the structure of coronavirus ADRPs in complex with ADP-ribose, two conserved residues (Asn and His) were proposed to be important for activity (Egloff et al. 2006; Saikatendu et al. 2005; Xu et al. 2009). Substitutions of these residues by alanine abolished ADRP activity in an in vitro assay, supporting their critical role (Egloff et al. 2006; Putics et al. 2005).

The biological role of the ADRP in the coronavirus life cycle has not been established. Characterization of an HCoV-229E mutant (HCoV-229E-N1305A) expressing an inactive ADRP revealed no apparent defects in virus reproduction and RNA synthesis in a cell culture system (Putics et al. 2005). Similar observations were made for the corresponding MHV-A59 mutant (MHV-N1348A) (Eriksson et al. 2008), confirming that ADRP activity is dispensable for coronavirus replication in vitro. However, when characterized in vivo, the MHV-N1348A mutant was found to be attenuated. At high infection doses, the mutant and wild-type viruses replicated to similar titers in the liver of C57BL/6 mice but, in contrast to wild-type virus, the N1348A mutant failed to cause severe liver pathology. Similar observations were made in C57BL/6 IFNAR–/– mice, arguing against a major role of the ADRP activity in counteracting IFN-α host responses. The underlying mechanisms for the observed low pathogenicity of the MHV-N1348A mutant in the liver have not been identified conclusively but may be linked to a reduced expression of proinflammatory cytokines, in particular IL-6. The data support the biological relevance of the catalytic activity of coronavirus ADRP domains which, on the basis of in vitro observations, has previously been questioned by others (Egloff et al. 2006; Neuvonen and Ahola 2009). The biologically relevant substrate(s) of the ADRP domains of SARS-CoV and other coronaviruses remain(s) to be identified. There is increasing evidence to suggest that macro domains have evolved different functionalities using a conserved fold and may have more than one activity which might not necessarily be mutually exclusive in any one domain. A striking example to support this idea is the recent identification of another macro domain within the SARS-CoV nsp3, located downstream of the ADRP (Chatterjee et al. 2009) and representing the central domain (SUD-M) of the SARS-CoV unique domain (SUD). 
On the basis of the NMR structure of SUD-M the ADRP was identified as the closest structural homolog of SUD-M, even though the sequence similarity between the two domains is very low (5% identical residues). It is tempting to speculate that the SUD-M domain may have evolved from ADRP through gene duplication, similar to what was previously proposed for the two paralogous PLpro domains found in many coronaviruses (Ziebuhr et al. 2001). SUD-M was found to bind to single-stranded poly(A)-RNA while others (Tan et al. 2007) had previously reported a poly(dG)/poly(G)-binding activity for SUD. The reasons for these differences are currently unclear but might be due to the different domain boundaries of the proteins characterized in the two studies. The full-length SUD has also been shown to have metal-binding activity (Neuman et al. 2008) which likely resides in the N-terminal part of the protein and involves some of the six conserved cysteine residues (SARS-CoV nsp3 positions 393, 456, 492, 507, 550, and 623) and two conserved histidine residues (positions 539 and 613).

Additional domains with nucleic acid-binding activity are present in SARS-CoV nsp3 and probably other coronaviruses. The bacterially-expressed N-terminal domain of nsp3 consistently copurified with nucleic acids (Serrano et al. 2007). The N-terminal region of nsp3 is particularly rich in acidic residues and is therefore also referred to as the acidic (Ac) domain (Ziebuhr et al. 2001). The domain is conserved in all coronaviruses but exhibits a low degree of sequence identity (Serrano et al. 2007). The NMR structure of the Ac domain revealed an N-terminal ubiquitin-like fold (residues 1–112) followed by a disordered tail particularly rich in glutamic acid residues (Serrano et al. 2007). The N-terminal globular domain differs from the ubiquitin-like fold by an elongated α2 helix and the presence of two additional helices, α3 and 3₁₀, which are important for RNA binding. The protein was found to preferentially bind (G)AU(A) sequences.

A further domain linked to RNA binding was identified in SARS-CoV nsp3 just downstream of the PLpro (Neuman et al. 2008). This domain, named nucleic acid-binding domain (NAB), is conserved in group 2 and 3 (but not in group 1) coronaviruses. Bacterially-expressed NAB exhibited ATP-independent double-stranded nucleic acid-unwinding activity, consistent with a possible chaperone function.

Coronavirus replication takes place at virus-induced double membrane vesicles (DMVs) (Gosert et al. 2002; Knoops et al. 2008). The TM domain present in nsp3 (Neuman et al. 2008; Snijder et al. 2003; Ziebuhr et al. 2001), along with TM domains in nsp4 and nsp6, is thought to be involved in tethering the viral replication–transcription complex to intracellular membranes. The topology of the SARS-CoV nsp3 TM domain was recently determined (Oostra et al. 2008). The data suggest that the N- and C-termini are located in the cytoplasm, thus placing the PLpro at the same face of the membrane as all its cleavage sites in pp1a/pp1ab. The domain traverses the membrane twice, while the third, central, predicted transmembrane helix does not appear to function as such. By analogy with the TM domains located in the equine arteritis virus nsp2 and nsp3 proteins (Snijder et al. 2001), whose role in DMV formation has been established earlier, Oostra and co-workers (Oostra et al. 2007, 2008) suggest that DMV formation in SARS-CoV infected cells is mediated by nsp3 and nsp4. They further hypothesize that the central non-TM hydrophobic domain might play an important role in this process by dipping into the membrane and inducing curvature of the membranes.

nsp4 is a tetra-spanning membrane protein (Oostra et al. 2008). Both termini are located in the cytoplasm (Oostra et al. 2007). TM helices 1 and 2 are connected by a long lumenal loop that is N-glycosylated (Oostra et al. 2007). The presumed critical role of nsp4 in DMV formation is supported by studies using a temperature-sensitive mutant of MHV-A59 in which substitution of the nsp4 Asn-258 residue with Thr led to impaired DMV formation, while polyprotein processing was not affected (Clementz et al. 2008). The critical role of nsp4, including its transmembrane-spanning regions 1–3, in coronavirus replication has been established in a recent reverse genetics study using MHV-A59 (Sparks et al. 2007). The study also showed that the C-terminal nsp4 residues Lys-398 to Thr-492 are dispensable for MHV replication.

nsp5 is the viral main protease (Mpro). It cleaves at 11 sites in the central and C-terminal pp1a/pp1ab regions (Thiel et al. 2003; Ziebuhr et al. 1995; Ziebuhr et al. 2000). Because of its prominent role in pp1a/pp1ab processing and the large body of structural and biochemical information available for this enzyme, the Mpro is considered an attractive target for the development of antivirals. The enzyme is distantly related to the 3C proteases of picornaviruses (hence its traditional name “3C-like protease,” 3CLpro) but diverged significantly from these viral homologs (Gorbalenya et al. 1989b). For example, in place of the typical Cys–His–Asp/Glu catalytic triad of picornaviral 3C proteases, the coronavirus Mpro employs a Cys–His catalytic dyad (Anand et al. 2003, 2002; Tan et al. 2005; Xue et al. 2008; Yang et al. 2003, 2005; Zhao et al. 2008). A water molecule was found to occupy the position of the third member of the catalytic triad in coronavirus Mpro structures and it has been suggested that this water molecule might stabilize the protonated His during catalysis. In common with many cellular and viral chymotrypsin-like proteases, coronavirus Mpro has a two-β-barrel fold which, in the coronavirus enzymes, is linked to a unique α-helical domain at the C-terminus (Anand et al. 2002). Mpro forms homodimers (Anand et al. 2003, 2002; Yang et al. 2003). Dimerization mainly occurs through interactions between the C-terminal domains of the two protomers in the dimer as well as a stretch of N-terminal residues (N-finger) (see Chap. 9 for details). Dimerization is generally believed to be a prerequisite for trans-processing activity. The dimeric structure is another feature that sets the coronavirus Mpro apart from 3C proteases which function as monomers. Over the past few years a large number of studies have provided significant insight into the structural details and dynamics of Mpro dimerization and their functional consequences. 
For a review of this work, the reader is referred to Chap. 9.

nsp6 is another TM protein. Both the N- and C-termini are located in the cytoplasm (Oostra et al. 2008), indicating an even number of TM helices and thus placing the Mpro on the same face of the membrane as all its substrates. The protein has seven putative TM helices. To satisfy the observed Nendo–Cendo localization of the protein, Oostra and co-workers (2008) proposed that only six of the predicted TM helices function as such in the context of the full-length protein and that helix 6 or possibly helix 7 might not traverse the membrane. Further, it has been suggested that the non-membrane-spanning helix may act as an interaction platform for other replicase components or aid in the formation of DMVs. The function of nsp6 remains to be characterized.

nsp8 was recently reported to be a second “noncanonical” RNA-dependent RNA polymerase (Imbert et al. 2006) (see Chap. 18). The protein synthesizes short oligonucleotides of up to 6 nucleotides and requires an RNA template and metal ions for activity. RNA synthesis was sequence specific, with a preference for the internal 5′-(G/U)CC-3′ trinucleotide sequence as the site of initiation, but exhibited a relatively low fidelity and processivity. nsp8 is well conserved among coronaviruses. The noncanonical RdRp is therefore suggested to be an essential enzymatic activity involved in RNA synthesis in all coronaviruses and, possibly, other related nidoviruses. Imbert and co-workers (Imbert et al. 2006) hypothesized that nsp8 could act as a primase that produces RNA primers which are then extended by the main RdRp, nsp12, in a mechanism reminiscent of DNA replication.
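The sequence preference reported for nsp8 can be illustrated by scanning a template for the 5′-(G/U)CC-3′ trinucleotide. The sketch below only lists candidate motif positions; it makes no claim about which positions are actually used, and it interprets "internal" as "not located at either end of the template", which is an assumption made here. The function name and the example templates are hypothetical.

```python
import re

# 5'-(G/U)CC-3' written as a regular expression over the RNA alphabet.
INITIATION_MOTIF = re.compile(r"[GU]CC")


def candidate_initiation_sites(template, internal_only=True):
    """Return 0-based start positions of (G/U)CC motifs in a 5'->3' template.

    DNA-style input is accepted (T is converted to U). With internal_only,
    motifs that touch the first or last nucleotide are dropped (assumption:
    this is what 'internal' means).
    """
    rna = template.upper().replace("T", "U")
    sites = [m.start() for m in INITIATION_MOTIF.finditer(rna)]
    if internal_only:
        sites = [s for s in sites if s > 0 and s + 3 < len(rna)]
    return sites


if __name__ == "__main__":
    # Made-up 12-nt template with one GCC and one UCC motif.
    print(candidate_initiation_sites("AAGCCAUUCCGA"))
    # Motifs at both termini are ignored unless internal_only is False.
    print(candidate_initiation_sites("GCCAAAUCC"))
    print(candidate_initiation_sites("GCCAAAUCC", internal_only=False))
```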

nsp8 was shown to interact with nsp7 and together these proteins form a hexadecameric ring structure consisting of eight copies of each protein as shown by X-ray crystallography (Zhai et al. 2005). The complex encircles a channel with a diameter of ~30 Å that is lined with positively-charged residues (for details, see Chap. 18). Addition of nsp7 to the primase activity assay did not increase the primase activity. However, nsp8 exhibits a low thermostability and nsp7 might therefore act as a stabilizing mortar (Imbert et al. 2006; Zhai et al. 2005).

The structure of SARS-CoV nsp9 was solved by two groups (Egloff et al. 2004; Sutton et al. 2004). A distant structural relationship with domain II of the Mpro was noted, suggesting that both proteins may be evolutionarily related and thus represent another example of domain duplication in the coronavirus replicase, similar to what was discussed above for coronavirus PLpro and macro domains. Furthermore, nsp9 is structurally related to proteins containing an oligosaccharide/oligonucleotide-binding (OB) fold, although the connectivity of the individual secondary structural elements differs in nsp9. A large proportion of OB-fold proteins bind nucleic acids and SARS-CoV nsp9 was confirmed to also bind ssRNA. Binding of ssRNA by nsp9 was nonspecific, but specificity might be attained by interaction with other viral or cellular proteins. Sutton and co-workers (2004) showed by analytical ultracentrifugation that SARS-CoV nsp9 interacts with nsp8. The experiment further suggested that this interaction might help stabilize an otherwise disordered domain of nsp8. nsp9 forms dimers through the interaction of parallel α-helices that contain a GXXXG protein–protein interaction motif (Miknis et al. 2009). Substitutions of either of the Gly residues with Glu disrupted dimer formation while RNA binding was only marginally affected. Viable SARS-CoV mutants carrying either mutation could not be recovered, suggesting that nsp9 dimerization is critical for virus replication.

nsp10 is another small replicase protein with RNA-binding activity. Two studies analyzing the structure of nsp10 described a single-domain protein that contains two Zn2+-fingers (Joseph et al. 2006; Su et al. 2006). nsp10 was shown to bind single- and double-stranded RNA and DNA with low affinity and no apparent specificity. nsp10 interacts with nsp9 as shown by cross-linking (Joseph et al. 2006) and in a GST-pulldown assay (Imbert et al. 2008). As nsp9, in turn, interacts with nsp8 and nsp8 forms a complex with nsp7 and all these proteins are involved in homotypic interactions (see above and Chap. 18), it is tempting to believe that these proteins form a multiprotein complex that, in the course of infection, undergoes structural rearrangements (possibly due to Mpro-mediated cleavages), thereby activating, modulating or inactivating specific RTC function(s) as required at the various steps of viral RNA synthesis.

Consistent with the presumed key role of Mpro processing in the formation of a functional RTC, viable MHV mutants could not be recovered if cleavage at the nsp7|8, nsp8|9 or nsp10|11(12) sites was abolished (Deming et al. 2007). Disruption of proteolytic processing at the nsp9|10 site gave rise to an attenuated but viable phenotype. After serial passaging, the MHV nsp9|10 cleavage site mutant regained near wild-type growth kinetics; this was not caused by restoration of processing at the nsp9|10 site but by a number of compensatory mutations at distant positions in the viral genome. This suggests that nsp9–10 can function as a fusion protein, though efficient replication requires adaptation of the virus. The functional role of the specific changes identified in some of the recovered viruses remains to be characterized.

nsp11 forms the C-terminal part of pp1a. SARS-CoV nsp11 is an oligopeptide of 13 residues. It is produced if no programmed frameshift occurs at the “slippery sequence” (see Chap. 6), leading to translation termination at the ORF1a stop codon. nsp11 shares its N-terminal amino acids (upstream of the frameshift site) with nsp12. Most of the nsp11 coding sequence overlaps with the RNA sequence involved in frameshifting and the nsp12 coding sequence, posing severe constraints on the nsp11 sequence and arguing against a functional role of nsp11. The protein has also not been detected in infected cells.

3.2 ORF1b-Encoded Nonstructural Proteins 12–16

The RNA-dependent RNA polymerase (RdRp, nsp12) is the most conserved protein of the coronavirus replicase–transcriptase. Expression of nsp12 (and all other ORF1b-encoded proteins) requires ribosomal frameshifting, implying that nsp12–16 are produced at significantly lower levels compared to ORF1a-encoded functions. SARS-CoV nsp12 is a protein of 932 residues. The actual catalytic domain containing the conserved RdRp motifs (Gorbalenya et al. 1989b; Koonin 1991) occupies the C-terminal region (C-terminal domain, CTD) of nsp12 while the N-terminal domain (NTD) spanning the first 375 amino acids has no known counterpart in other RdRps (Xu et al. 2003). Xu and co-workers (2003) proposed a three-dimensional (3D) homology model for the SARS-CoV nsp12 CTD. This model showed the characteristic cupped right hand palm–finger–thumb structure encircling a nucleic acid-binding tunnel. The RdRp activity of nsp12 was recently confirmed by showing that bacterially-expressed nsp12 is able to extend an oligo(U) primer hybridized to a poly(A)-template (Cheng et al. 2005).

SARS-CoV nsp13 is a multidomain protein of 601 amino acid residues. The N-terminal region contains a zinc-binding domain (ZBD) while a helicase domain featuring the typical conserved motifs of superfamily 1 helicases is present in the C-terminal half (Gorbalenya et al. 1989a, 1989b). The ZBD contains 12 conserved cysteine and histidine residues that are predicted to form a binuclear Zn2+-binding cluster (Seybert et al. 2005; van Dinten et al. 2000). It is conserved in all nidoviruses and is critical for helicase activity in vitro (Seybert et al. 2005) and RNA synthesis of HCoV-229E in cell culture (Hertzig and Ziebuhr, unpublished). Coronavirus helicases (including SARS-CoV nsp13) were shown to unwind RNA and DNA duplexes in a 5′-to-3′ direction with respect to the single-stranded RNA they initially bind to (Ivanov and Ziebuhr 2004; Ivanov et al. 2004a; Seybert et al. 2000; Tanner et al. 2003). Translocation of nsp13 along RNA (and concomitant duplex unwinding) is fueled by NTP or dNTP hydrolysis. As with many other helicases, the nsp13-associated NTPase/dNTPase activity is stimulated by nucleic acids (Heusipp et al. 1997). Additionally, nsp13 exhibits RNA 5′-triphosphatase activity, which was proposed to catalyze the first step of the 5′-capping reaction of viral RNAs (Ivanov and Ziebuhr 2004; Ivanov et al. 2004a).

The N-terminal part of nsp14 contains a 3′-to-5′ exoribonuclease (ExoN) domain. ExoN is related to the DEDD superfamily of exonucleases (Zuo and Deutscher 2001) but carries an additional putative Zn2+-finger structure that is inserted between the conserved motifs II and III (Snijder et al. 2003). The ExoN activity of SARS-CoV nsp14 was demonstrated in vitro and shown to be specific for single-stranded and double-stranded RNA (Minskaia et al. 2006). The protein was shown to require metal ions for activity, and isothermal titration calorimetry data suggest that nsp14 binds two Mg2+ ions per molecule (Chen et al. 2007a). These data, together with the profound reduction of activity upon substitution of putative metal ion-coordinating residues (Minskaia et al. 2006), suggest that catalysis occurs through a two-metal-ion mechanism similar to that used by many cellular enzymes mediating phosphoryl transfer reactions (Beese and Steitz 1991).

It has been proposed that coronaviruses and other nidoviruses with genome sizes of about 30 kb have evolved specific mechanisms to replicate their large RNA genomes. The argument is that, in the absence of such mechanisms, the intrinsically error-prone RdRps of RNA viruses would cross a (postulated) threshold of nucleotide misincorporations above which the survival of a given virus population becomes impossible. The question of how coronaviruses are able to maintain their exceptionally large RNA genomes has not been resolved but a number of recent observations suggest that nsp14 was critically involved in the evolution of these large genomes. First, coronaviruses were shown to encode a 3′-to-5′ ExoN that is related to cellular enzymes involved in proofreading mechanisms during cellular DNA replication. Second, the ExoN activity is not conserved in nidoviruses with much smaller genomes (i.e., Arteriviridae), even though the pp1a/pp1ab domain organization is otherwise very well conserved among small and large nidoviruses (Gorbalenya et al. 2006; Snijder et al. 2003). Third, ExoN activity was found to be essential for HCoV-229E replication in cell culture (Minskaia et al. 2006). Transfection of genome-length RNAs containing substitutions of ExoN active-site residues failed to produce viable virus. Northern blot analysis of viral RNA in transfected cells revealed a severe reduction in genome replication, an altered molar ratio of sgRNAs and aberrant migration of two sgRNAs in some of the mutants (Minskaia et al. 2006). Consistent with these data, deletion of the ExoN domain from a SARS-CoV replicon was reported to abolish RNA synthesis and substitution of a putative ExoN active-site residue reduced genome replication and transcription about 10-fold in this system (Almazan et al. 2006). Fourth, the characterization of MHV-A59 ExoN active-site mutants revealed defects in viral RNA synthesis and rapid accumulation of mutations across the genome (Eckerle et al. 2007). The authors calculated that, during passage (and under strong selection pressure), MHV-A59 ExoN mutants accumulated approximately 15-fold more substitutions than the wild-type virus. The data obtained in these studies are consistent with the proposed role of ExoN in increasing the fidelity of the coronavirus RdRp, although there is still no direct proof for this specific function.
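The error-threshold argument above can be made concrete with a back-of-the-envelope calculation. The sketch below is not taken from the cited studies. It assumes that misincorporations occur independently and at a uniform per-nucleotide rate μ, and the two values of μ are purely illustrative assumptions rather than measured coronavirus error rates. With genome length L, the expected number of errors per genome copy is U and the probability of an error-free copy is P0:

```latex
% Assumption: independent errors at a uniform per-nucleotide rate \mu.
\begin{align*}
U   &= \mu L, \\
P_0 &= (1-\mu)^{L} \approx e^{-\mu L} = e^{-U}.
\end{align*}
% L = 3 \times 10^{4} nt (about 30 kb, as stated in the text).
% The \mu values below are illustrative assumptions, not measurements.
\begin{align*}
\mu = 10^{-4}: \quad U &= 3,   \quad P_0 \approx e^{-3}   \approx 0.05, \\
\mu = 10^{-5}: \quad U &= 0.3, \quad P_0 \approx e^{-0.3} \approx 0.74.
\end{align*}
```

Under these assumptions, a tenfold difference in per-nucleotide fidelity separates a regime in which most 30-kb copies are error-free from one in which almost none are, which illustrates why a fidelity-enhancing activity would matter more as genome length increases.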

Coronaviruses encode a second conserved ribonuclease, NendoU (Nidoviral endoribonuclease, specific for U), which is located within nsp15 (Snijder et al. 2003). The domain is conserved not only in coronaviruses but also in all other members of the order Nidovirales. NendoU homologs could not be identified in any other RNA virus, which makes the domain a genetic marker of nidoviruses (Ivanov et al. 2004b). SARS-CoV nsp15 was shown to be an endoribonuclease that cleaves preferentially 3′ of uridylates and generates 2′–3′ cyclic phosphate ends (Bhardwaj et al. 2004; Ivanov et al. 2004b). The activity is significantly enhanced by Mn2+ while addition of Mg2+ or Ca2+ only had minor effects on activity. The structure of SARS-CoV nsp15 revealed a novel fold (see Chap. 18). In spite of unrelated structures and lack of sequence similarity, nsp15 and RNase A were proposed to use the same catalytic mechanism (Ricagno et al. 2006). This proposal rests on two observations: residues forming the catalytic triad of bovine RNase A (His-12, Lys-41, His-119) could be superimposed with His/Lys residues known to be critical for activity of SARS-CoV nsp15 (Ivanov et al. 2004b), and both RNase A and SARS-CoV NendoU generate 2′–3′ cyclic phosphate ends. The proposed RNase A-like catalytic mechanism does not involve metal ions and therefore the observed stimulatory effects of Mn2+ on NendoU activities of several (but not all) coronaviruses (Bhardwaj et al. 2004; Cao et al. 2008; Ivanov et al. 2004b; Xu et al. 2006) cannot readily be reconciled with a role in catalysis. Based on intrinsic tryptophan fluorescence data, Bhardwaj and co-workers (2004) suggested that Mn2+ ions induce specific conformational changes in the protein that might promote its nuclease activity. However, the available crystal structure information does not provide evidence for the presence of Mn2+ ion-binding sites in any of the coronavirus NendoUs studied (Ricagno et al. 2006; Xu et al. 2006). More recently, it was suggested that Mn2+ ions increase the RNA-binding activity of NendoU (Bhardwaj et al. 2006).

Coronavirus NendoUs form hexamers consisting of a dimer of trimers in crystals and in solution (see Chap. 18). Amino acid substitutions that interfere with hexamerization were reported to reduce the nucleolytic activity and RNA affinity of SARS-CoV NendoU, suggesting that hexamerization is critically involved in activity (Guarino et al. 2005). However, the presence of six independent active sites in the NendoU crystal structure, together with relatively minor differences in the Km and kcat values determined for monomeric and hexameric MHV NendoU (Xu et al. 2006) and the fact that maltose-binding protein–NendoU fusion proteins that are unable to form hexamers possess endoribonucleolytic activity (Ivanov et al. 2004b), suggest that hexamerization may not be essential for activity. It therefore remains possible that NendoU may be active prior to its proteolytic release from pp1ab, for example in the context of NendoU-containing processing intermediates or the full-length polyprotein.

The role of NendoU in the coronavirus life cycle is not well understood. Reverse genetics data obtained for HCoV-229E (Ivanov et al. 2004b) and MHV (Kang et al. 2007) suggest that NendoU activity is not essential for viral replication. Substitutions of NendoU active-site His and Lys residues in MHV were shown to cause subtle defects in sgRNA accumulation and reduce virus titers by about 10-fold. Substitutions of a conserved Asp residue abolished viral RNA synthesis, both in HCoV-229E and MHV (Ivanov et al. 2004b; Kang et al. 2007). Both structural information and biochemical data suggest that this particular residue may have an important structural role, so the observed defects in viral replication may be due to misfolding of nsp15 or other pp1a/pp1ab subunits rather than to the lack of NendoU activity. More studies are needed to understand the function of the conserved NendoU activity in nidoviral replication.

NendoU belongs to a family of proteins that is prototyped by a cellular endoribonuclease, called XendoU, from Xenopus laevis (Laneve et al. 2003). XendoU is a poly(U)-specific endoribonuclease that, together with ExoN and methyltransferase (MT) activities, is involved in the processing of intron-encoded small nucleolar (sno) RNAs and site-specific RNA methylation pathways in X. laevis oocytes (Laneve et al. 2003). It has been suggested (but not yet explored experimentally) that a similar set of activities associated with coronavirus nsp14 (ExoN), nsp15 (NendoU) and nsp16 (MT), which are coexpressed in the C-proximal pp1ab region, may be involved in related RNA-processing and methylation pathways (Snijder et al. 2003).

The most C-terminal processing product of pp1ab, nsp16, has been proposed to be related to the RrmJ/FtsJ family of S-adenosylmethionine-dependent ribose-2′-O-methyltransferases (Feder et al. 2003; Snijder et al. 2003). The predicted MT activity was recently confirmed for nsp16 of feline coronavirus (FCoV) (Decroly et al. 2008). FCoV nsp16 was shown to methylate 7MeGpppACn at the ribose-2′-O moiety of the adenosine, converting a cap-0 to a cap-1 structure. The domain is critically involved in coronavirus replication as deletion or ablation of expression of nsp16 abolished RNA synthesis in SARS-CoV (Almazan et al. 2006) and HCoV-229E (Hertzig, Schelle and Ziebuhr, unpublished data) while substitution of one of the residues forming the catalytic tetrad reduced sgRNA synthesis about 10-fold in a SARS-CoV replicon system (Almazan et al. 2006). The 5′-cap structure is known to be critically important for the stability of cellular mRNAs and translation initiation (reviewed in Shuman 2001). The cellular capping apparatus is located in the nucleus, implying that viruses that replicate in the cytoplasm need to either provide all the enzymes required to produce RNA cap structures or employ alternative mechanisms, such as cap snatching (Plotch et al. 1981). In eukaryotic cells, the production of 7MeGpppG2′OMeN(2′OMe) cap-1 (and cap-2) structures is achieved through four consecutive enzymatic reactions: (1) removal of the RNA 5′ γ-phosphate by an RNA 5′-triphosphatase, (2) transfer of GMP to the remaining 5′ diphosphate end by a guanylyltransferase, (3) methylation of the guanine at the N7 position by a guanine-N7-methyltransferase, resulting in a cap-0 structure, and (4) methylation of ribose-2′-O-moieties of the first (and second) nucleotide of the mRNA by a ribose-2′-O-methyltransferase, resulting in cap-1 and cap-2 structures, respectively (Langberg and Moss 1981; Shuman 2001). Coronaviral RNAs are modified at the 5′-end, probably with a cap structure (Lai et al. 1982), and it seems reasonable to suggest that the nsp16-associated ribose-2′-O-methyltransferase activity catalyzes the conversion of cap-0 into cap-1 structures whereas the nsp13-associated RNA 5′-triphosphatase activity might catalyze the first step of 5′-cap formation (Ivanov and Ziebuhr 2004; Ivanov et al. 2004a). Homologs of cellular guanylyltransferases and guanine-N7-methyltransferases have not been identified in coronaviruses and it remains to be seen what viral or cellular proteins/activities mediate the two remaining reactions, namely GMP transfer and guanine-N7 methylation, to synthesize RNA cap structures on coronavirus RNAs. While some plus-strand RNA viruses, such as alphaviruses, encode all four required activities (reviewed in Salonen et al. 2005), other viruses use one and the same domain to perform two different reactions. For example, the West Nile virus MT methylates at both the guanine N7 position and the ribose-2′-O moiety using an interesting substrate-repositioning mechanism and a single S-adenosylmethionine-binding site (Dong et al. 2008). While FCoV nsp16 had no guanine-N7-methyltransferase activity and did not bind unmethylated GpppACn (Decroly et al. 2008), the authors point out that recognition of unmethylated, guanylylated RNA might depend on regulatory RNA elements located further downstream in the RNA, similar to what was described for flaviviruses (Dong et al. 2007; Ray et al. 2006).
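The four capping reactions and the coronavirus assignments discussed above can be collected in a small lookup structure. The following Python sketch is purely organizational: the class, field and function names are hypothetical, the product strings are simplified shorthand, and the candidate column only restates the proposals made in the text (None marks a step for which no coronavirus activity has been identified).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CappingStep:
    """One reaction of the eukaryotic cap-1 pathway (names are illustrative)."""
    order: int
    reaction: str
    enzyme: str
    product: str
    coronavirus_candidate: Optional[str]  # None = not identified in the text


STEPS: List[CappingStep] = [
    CappingStep(1, "remove 5' gamma-phosphate",
                "RNA 5'-triphosphatase", "ppN-RNA",
                "nsp13 (proposed)"),
    CappingStep(2, "transfer GMP to 5' diphosphate end",
                "guanylyltransferase", "GpppN-RNA",
                None),
    CappingStep(3, "methylate guanine at N7",
                "guanine-N7-methyltransferase", "cap-0",
                None),
    CappingStep(4, "methylate ribose 2'-O of first nucleotide",
                "ribose-2'-O-methyltransferase", "cap-1",
                "nsp16 (proposed; activity shown for FCoV nsp16)"),
]


def unassigned_steps(steps: List[CappingStep]) -> List[int]:
    """Return the order numbers of steps lacking a coronavirus candidate."""
    return [s.order for s in steps if s.coronavirus_candidate is None]


if __name__ == "__main__":
    for step in STEPS:
        candidate = step.coronavirus_candidate or "unknown"
        print(f"({step.order}) {step.enzyme}: {step.reaction} -> "
              f"{step.product} [{candidate}]")
    print("Steps without an assigned activity:", unassigned_steps(STEPS))
```

Running the script lists the four steps in order and reports the GMP transfer and guanine-N7 methylation steps as the two for which a coronavirus activity remains to be identified.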

4 Future Directions

The emergence of SARS-CoV led to a renewed interest in coronavirology and recent years saw a significant increase in research involving this family of viruses. Many of the previously predicted coronavirus replicase gene-encoded enzyme activities were characterized by biochemical and structural approaches using viral proteins expressed in heterologous systems (Ziebuhr 2008). In a few cases, structural studies informed subsequent biochemical studies, resulting in the identification of RNA-binding domains and other functions (Egloff et al. 2004; Joseph et al. 2006; Zhai et al. 2005).

Most of the biochemical and structural studies reported over the past years involved isolated nsps or subdomains of these proteins rather than multidomain complexes. Although these studies provided invaluable new insight into structure–function relationships of many of these proteins, there is hardly any information regarding the quaternary structure(s) and subunit compositions of replicase and/or transcriptase complexes catalyzing specific reactions during viral RNA synthesis. For example, very little is known about the special factors and macromolecular interactions involved in discontinuous minus-strand RNA synthesis, a unique feature of coronaviruses and several other nidoviruses. Similarly, the replication and maintenance of the exceptionally large coronavirus genome is likely to involve the concerted action of replicase gene-encoded nsps, possibly including processing precursors and intermediates as pointed out above. The physical and functional interactions between the various components of the replicase–transcriptase probably undergo significant changes in the course of the viral replication cycle, further complicating the characterization of these complexes. Despite these major technical challenges, recent studies increasingly try to elucidate the functions and structures of complexes involving two or more proteins (Zhai et al. 2005). Also, the availability of reverse genetics systems for several coronaviruses has greatly facilitated the characterization of specific protein functions and critical interactions between domains encoded by very different regions of the replicase gene (Donaldson et al. 2007).

The recently established method for purification of entire functional (membrane-bound) RTCs (van Hemert et al. 2008) presents exciting new possibilities to study coronavirus RNA synthesis. Advanced imaging techniques including cryo-electron microscopy have provided fascinating new insight into key structures involved in virus replication. For example, cryo-electron microscopy of ribosomes stalled at a specific stage of frameshifting provided a glimpse of how the structure of the mRNA template affects ribosome function to mediate this frameshift (Namy et al. 2006), and electron tomography of SARS-CoV-infected cells provided a 3D view of the unique continuous reticulovesicular network, the formation of which is induced by the virus (Knoops et al. 2008). However, much remains to be studied to understand how the virus coaxes the cell to produce these structures. The precise role of these membrane structures in virus replication and, possibly, immune evasion needs to be characterized in more detail.

Further progress has been made in our understanding of the interactions of the virus with the host cell, especially with regard to innate immune responses to coronavirus infections and the role of specific viral proteins in counteracting these antiviral host responses. The understanding of these pathways in combination with biochemical, structural and genetic approaches will continue to provide exciting insights into virus replication and virus–host interactions and form the basis for the development of antiviral drugs and new and better coronavirus vaccines.