Introduction

Antiviral drug targets are prone to mutations under the selective pressure of drug therapies. These mutations contribute to drug resistance by reducing the activity of inhibitors while allowing the drug-resistant variant of the target to function on the native substrates. The delicate balance between inhibitor binding and substrate recognition is effectively altered by drug resistance mutations at the expense of the inhibitor. The substrate-envelope hypothesis provides the structural basis for this alteration. This review provides a general background on the evolution of drug resistance in viral proteases, specifically in human immunodeficiency virus protease and hepatitis C virus protease NS3/4A. The general applicability of the substrate-envelope hypothesis to other systems is discussed, and a framework for substrate-envelope-guided drug design is outlined to minimize the probability of drug resistance in the design of new inhibitors.

Evolution of Resistance Against Anti-HIV Drugs

Human immunodeficiency virus (HIV) is a lentivirus of the Retroviridae family that infects the human immune system and causes the acquired immunodeficiency syndrome (HIV/AIDS). HIV is a rapidly evolving virus that, without effective treatment, results in serious medical, social, and economic burden. UNAIDS reports that, as of the end of 2012, 35.3 million people were living with HIV globally, with 2.3 million new infections and 1.6 million deaths from AIDS-related causes (Global report: UNAIDS report on the global AIDS epidemic 2013). HIV has two types and several clades within each type, with distinct patterns of spread and progression to AIDS (Santos and Soares 2010). HIV type 1 (HIV-1) is responsible for the pandemic. Because HIV-1 cannot be cured, suppressing viral replication and maintaining viral load at low to undetectable levels have become critical goals in the field of HIV-1 research. Highly active antiretroviral therapy (HAART) has been a successful strategy in providing long, quality life for infected individuals and is the current global standard of care for HIV/AIDS patients (Palella et al. 1998; Hoggs et al. 1998). As a part of HAART, the US Food and Drug Administration (FDA) has approved more than 30 drugs that target various stages of the viral replication cycle, including fusion and entry, reverse transcription, integration, and proteolytic processing of viral polyproteins. However, a high frequency of random nucleotide misincorporation by the error-prone reverse transcriptase (about three mutations per virion per round of replication) and a huge reservoir of replicating virus (10^10 infected cells in an average patient) diversify the viral population (Coffin 1995). The selective pressure of therapy, especially combined with low drug adherence, facilitates the emergence of drug-resistant viral variants (Ali et al. 2010).

HIV-1 Protease: A Virally Encoded Protease as a Drug Target

The viral genome is translated as polyproteins, which are proteolytically processed by the virally encoded protease to yield functional and structural proteins. Due to this crucial role in the viral life cycle, HIV-1 protease has been a key drug target in the treatment of HIV/AIDS (Kohl et al. 1988). HIV-1 protease is an aspartyl protease and a symmetric homodimer of 99 amino acids each (Fig. 1a). Each monomer contains a flap comprising two antiparallel β-strands connected by a β-turn and situated on top of the catalytic site. The dimeric enzyme is stabilized by four antiparallel β-strands, two from each subunit, which form an interdigitated β-sheet. Substrates are hydrolyzed at the dimer interface. The active site is typically considered to comprise residues 25–32, 47–53, and 80–84, with each monomer contributing a catalytic triad (Asp-25/Thr-26/Gly-27). Accurate and precise processing of the viral polyproteins is critical for virion assembly and maturation; therefore, HIV-1 protease cleaves the Gag and Gag-Pol polyproteins at twelve known sites in a highly specific order. While hydrophobic residues are favored at the P1/P1′ positions, between which the scissile bond is hydrolyzed, the cleavage sites are in general nonhomologous in sequence and asymmetric in size and charge (Fig. 1b). Because the protease is symmetric while the cleavage sites are diverse and asymmetric, a sequence-based approach has not been able to fully explain the specificity determinants of substrate recognition.

Fig. 1

(a) HIV-1 protease is a homodimeric aspartyl protease, shown in ribbon. Identical monomers A and B are colored in blue and green. The substrate-binding region is located at the dimer interface, and a bound substrate peptide is colored in magenta. (b) HIV-1 protease recognizes 12 sites on Gag and Gag-Pol polyproteins and cleaves the scissile bond between P1 and P1′ residues. (c) More than half of the protease gene mutates under the selective pressure of therapies that include protease inhibitors. Major drug resistance mutations and resistance-associated mutations are colored red and dark blue, respectively. Major drug resistance mutations are labeled on monomer A, while resistance-associated mutations are labeled on monomer B. The catalytic triad, at the dimeric interface, is colored yellow. (d) FDA-approved drugs targeting HIV-1 protease

Development of HIV-1 protease inhibitors (PIs) is regarded as a major success of structure-based rational drug design. Nine PIs have so far been approved for clinical use: saquinavir (SQV) (Roberts et al. 1990), indinavir (IDV) (Dorsey et al. 1994), ritonavir (RTV) (Kempf et al. 1995), nelfinavir (NFV) (Kaldor et al. 1997), amprenavir (APV) (Kim et al. 1995), lopinavir (LPV) (Sham et al. 1998), atazanavir (ATV) (Robinson et al. 2000a), tipranavir (TPV) (Turner et al. 1998), and darunavir (DRV) (Fig. 1d) (De Meyer et al. 2005; Koh et al. 2003; Surleraux et al. 2005). All PIs, except for TPV, are peptidomimetics. These PIs were rationally designed to bind to the protease with the flaps of the enzyme tightly closed over the active site, mimicking the transition state between substrate binding and cleavage reaction and thereby effectively inactivating the enzyme.

As PIs are an essential component of HAART (Gulick et al. 2000, Bartless et al. 2001), drug resistance to PIs has become an issue in the failure of HAART. Mutations at almost half of the protease residues are selected in different combinations under drug treatment, and some combinations confer drug resistance (Wu et al. 2003; Rhee et al. 2003) (Fig. 1c). Primary mutations in the active site reduce both protease catalytic efficiency and viral replicative capacity (Martinez-Picado et al. 2000, 1999; Croteau et al. 1997; Bleiber et al. 2001). Major PI resistance mutations occur at residues 30, 32, 33, 46, 47, 48, 50, 54, 76, 82, 84, 88, and 90, while mutations at residues 10, 11, 16, 20, 23, 24, 34, 35, 36, 43, 53, 58, 60, 62, 63, 64, 66, 69, 71, 73, 74, 77, 83, 85, 89, and 93 were reported to be selected in PI-treated patients, and some were shown to contribute to resistance (HIV Databases; Johnson et al. 2013). Among the major resistance-causing mutations, D30N, V32I, I47V/A, G48V/M, I50V/L, V82A/F/T/S/L, and I84V are located in the active site, while L33F, M46I/L, I54V/T/A/L/M, L76V, N88S/D, and L90M are non-active site mutations.
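The residue lists above can be captured in a small lookup, which is sometimes convenient when annotating a sequenced protease. The following is only a minimal sketch: the function name `classify_mutation` and the category labels are illustrative choices, and the lookup works by residue position alone, so it does not check whether the specific amino acid substitution is one of those named in the text.

```python
import re

# Residue positions copied from the paragraph above.
MAJOR_ACTIVE_SITE = {30, 32, 47, 48, 50, 82, 84}
MAJOR_NON_ACTIVE_SITE = {33, 46, 54, 76, 88, 90}
OTHER_SELECTED = {
    10, 11, 16, 20, 23, 24, 34, 35, 36, 43, 53, 58, 60,
    62, 63, 64, 66, 69, 71, 73, 74, 77, 83, 85, 89, 93,
}

# Mutations are written as wild-type residue, position, mutant residue (e.g. V82A).
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d{1,2})([A-Z])$")


def classify_mutation(mutation: str) -> str:
    """Classify an HIV-1 protease mutation by residue position only.

    Hypothetical helper for illustration; the substitution itself is
    parsed but not validated against the lists in the text.
    """
    match = MUTATION_PATTERN.match(mutation.strip().upper())
    if match is None:
        raise ValueError(f"not a mutation of the form V82A: {mutation!r}")
    position = int(match.group(2))
    if not 1 <= position <= 99:  # each protease monomer has 99 residues
        raise ValueError(f"position outside the 99-residue monomer: {position}")
    if position in MAJOR_ACTIVE_SITE:
        return "major, active site"
    if position in MAJOR_NON_ACTIVE_SITE:
        return "major, non-active site"
    if position in OTHER_SELECTED:
        return "selected in PI-treated patients"
    return "not listed"
```

For instance, `classify_mutation("V82A")` falls in the active-site group, whereas `classify_mutation("L90M")` is a major mutation outside the active site.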

Mutations in HIV-1 protease, either within or outside the active site, can contribute to drug resistance directly by impacting inhibitor binding or indirectly in an interdependent and cooperative manner. Most primary mutations in the active site reduce binding affinity of PIs. In contrast, some non-active site mutations are located in the hydrophobic core of the protein (13, 24, 33, 36, 62, 66, 77, 85, 90, 93) and contribute to resistance by altering the exchange dynamics of the hydrophobic interactions within the core (Foulkes-Murzycki et al. 2007; Mittal et al. 2012). While certain primary resistance mutations are a signature of particular PIs, cross-resistance is an issue in HIV-1 PIs. For example, D30N is a nonpolymorphic NFV-selected mutation which confers phenotypic and clinical resistance to NFV (Rhee et al. 2003; Patick et al. 1998). I50L is selected with ATV treatment and confers high-level ATV resistance while significantly increasing susceptibility to the rest of the PIs (Colonno et al. 2004). In contrast, I50V is selected in APV-, LPV-, and DRV-treated patients and reduces the efficacy of these PIs while increasing TPV efficacy (HIV Databases). V82A is selected primarily by IDV and LPV (Condra et al. 1996; Kantor et al. 2005). In addition to decreasing susceptibility to IDV and LPV, V82A also confers cross-resistance to ATV and NFV and is associated with decreased susceptibility to SQV and APV in combination with other mutations (Condra et al. 1996; Kempf et al. 2001). I84V is a very severe mutation that is selected by each of the available PIs and causes cross-resistance to most PIs (Rhee et al. 2003; HIV Databases). Similarly, G48V is a primary resistance mutation selected by SQV and, less often, by IDV and LPV; it confers high-level resistance to SQV, intermediate resistance to ATV, and low-level resistance to NFV, IDV, and LPV (Rhee et al. 2003; Kantor et al. 2002; Schapiro et al. 1996; Rhee et al. 2010).

Mutations have been selected at either a single site or a combination of sites. The mechanisms by which resistance is conferred via these mutations are very complex and interdependent. In addition to the accumulation of resistance mutations within the active site, mutations also develop in non-active site protease residues and within the substrate cleavage sites, predominantly at the NC-p1 and p1-p6 sites, altering the susceptibility to various PIs (Zhang et al. 1997; Bally et al. 2000; Mammano et al. 1998; Maguire et al. 2002; Kolli et al. 2009). Evolution of mutations within the cleavage sites leads not only to improved viral fitness compared to the viral variants carrying protease resistance mutations (Zhang et al. 1997; Mammano et al. 1998; Doyon et al. 1996; Robinson et al. 2000b) but often also to increased resistance (Kolli et al. 2009). The vast number of mutation sites in both the protease and substrates, with several possible amino acid substitutions at each site and in combination with cross-resistance, has made drug resistance a very complex problem.

The Substrate-Envelope Hypothesis

HIV-1 protease is a structurally well-studied drug target with more than 600 entries in the Protein Data Bank as of December 2013 (Berman et al. 2000). A vast majority of these entries are co-crystal structures of small-molecule inhibitors with HIV-1 protease variants, including drug-resistant forms. These structural studies shed light on the molecular mechanisms by which protease mutations render inhibitors less effective; however, investigating only the inhibitor complexes has not been sufficient as a rational drug design strategy to minimize the likelihood of emerging resistance mutations.

Resistance to available drugs occurs when the balance in molecular recognition is subtly altered. Drug-resistant variants of HIV-1 protease are no longer effectively blocked by the competitive inhibitors but still process the natural substrates efficiently enough for viral survival. This observation leads to the assumption that the native function of the protease imposes an evolutionary constraint under the selective pressure of drug therapy. To discover robust drugs that can resist drug resistance, the balance between natural substrate recognition and inhibitor binding needs to be characterized at the molecular level.

The substrate-envelope hypothesis, which was established on the basis of crystallographic studies on HIV-1 protease and later shown to be valid for HCV NS3/4A protease, provides a structural explanation for the specificity determinants of natural substrate recognition and for drug resistance upon primary mutations in the protease active site. According to the substrate-envelope hypothesis, inhibitors that better mimic the natural substrate-binding features are less susceptible to the rapidly emerging mutations selected under drug treatment. In this section, the crystallographic studies that led to the substrate-envelope hypothesis are described in detail, focusing on substrate specificity and drug resistance in HIV-1 protease. In addition, parallels in the molecular basis of resistance against HIV-1 and HCV NS3/4A protease inhibitors are highlighted, and the up-to-date evidence suggesting that mutational ensembles of NS3/4A protease can also be targeted rationally with a substrate-envelope-based drug design approach is presented.

Structural Basis of Substrate Specificity and Drug Resistance

HIV-1 protease, a symmetric enzyme, specifically recognizes diverse asymmetric sequences on the Gag and GagProPol polyproteins (Prabu-Jeyabalan et al. 2000). Amino acid sequence alone is not the specificity determinant for asymmetric substrate recognition; instead, the substrates share a binding mode in an extended conformation (Prabu-Jeyabalan et al. 2002). Co-crystal structures of decameric peptides corresponding to the cleavage sites showed that HIV-1 protease recognizes a consensus shape in substrates, not necessarily a consensus sequence (Prabu-Jeyabalan et al. 2002). This consensus shape, the volume adopted by the majority of the substrates within the protease active site, has been termed the substrate envelope (Fig. 2a). According to the substrate-envelope hypothesis, the substrate envelope is the recognition motif for HIV-1 protease, and the cleavage sites within Gag that are able to adopt this shape are likely to be processed.

Fig. 2

(a) HIV-1 protease substrate and inhibitor envelopes are colored blue and red, respectively. The two envelopes were superimposed to highlight the regions where inhibitors protrude beyond the substrate envelope to make more extensive contacts with the protease residues that correspond to the previously known sites of drug resistance (Figure modified from King et al. (2004a)). (b) Hepatitis C virus NS3/4A protease substrate envelope (blue) and a small-molecule inhibitor of NS3/4A protease, danoprevir, are shown in comparison along with the binding site residues (Figure modified from Romano et al. (2010))

HIV-1 PIs that have been approved for clinical use are all low-molecular-weight compounds with fairly similar three-dimensional shape and electrostatic character, and they all have large, hydrophobic moieties that interact with the mainly hydrophobic S2-S2′ pockets in the binding site. In the co-crystal structures, most HIV-1 PIs adopt a very similar binding mode interacting with a common set of protease residues in the active site. The inhibitors were shown to occupy a consensus inhibitor volume within the binding site, termed the inhibitor envelope (Fig. 2) (King et al. 2004a). Based on the structural comparison of the inhibitor and substrate envelopes, the inhibitors were shown to protrude beyond the substrate envelope and make favorable contacts with certain protease residues of the wild-type protease. Because these protease residues interact more favorably with the inhibitors than the natural substrates, these protease residues are more important for inhibitor binding than substrate recognition. Strikingly, the protease residues contacted by inhibitors outside the substrate envelope corresponded to the previously known drug resistance mutation sites. Mutations at these sites would specifically impact inhibitor binding, while substrate recognition and cleavage would be less affected. Most sites of drug-resistant mutations in the active site do not contact the substrates, which led to the hypothesis that the inhibitors that fit well within the substrate envelope would be less susceptible to drug resistance, because a mutation that affects inhibitor binding would simultaneously impact the recognition and processing of the majority of the substrates (King et al. 2004a). As a retrospective validation, of the currently prescribed inhibitors, the most efficacious is DRV, and although not designed using the substrate-envelope constraint, DRV fits well within this volume (King et al. 2004b; Lefebvre and Schiffer 2008).
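The comparison of the two envelopes can be illustrated with a small grid-based sketch. It assumes, for illustration only, that each bound substrate or inhibitor has already been superimposed in a common frame and converted to a set of occupied grid cells (voxels); that preprocessing step is not shown, and the function names and the default majority threshold of one half are hypothetical choices rather than the published protocol.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Voxel = Tuple[int, int, int]  # integer grid coordinates of an occupied cell


def consensus_envelope(
    substrate_shapes: Iterable[Set[Voxel]], min_fraction: float = 0.5
) -> Set[Voxel]:
    """Voxels occupied by more than `min_fraction` of the substrates.

    With the default threshold this is the volume adopted by the
    majority of the substrates (an assumed reading of "majority").
    """
    shapes = list(substrate_shapes)
    if not shapes:
        raise ValueError("at least one substrate shape is required")
    counts = Counter(voxel for shape in shapes for voxel in shape)
    threshold = min_fraction * len(shapes)
    return {voxel for voxel, n in counts.items() if n > threshold}


def protruding_voxels(inhibitor: Set[Voxel], envelope: Set[Voxel]) -> Set[Voxel]:
    """Voxels of an inhibitor that fall outside the substrate envelope."""
    return inhibitor - envelope


def protruding_volume(
    inhibitor: Set[Voxel], envelope: Set[Voxel], voxel_volume: float = 1.0
) -> float:
    """Volume of the inhibitor outside the envelope, in units of voxel_volume."""
    return voxel_volume * len(protruding_voxels(inhibitor, envelope))
```

In such a scheme, protease residues in contact with the protruding voxels would be the candidates expected to coincide with resistance mutation sites, and a smaller protruding volume would indicate a better fit within the substrate envelope.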

Sequence diversification in the protease is not the only mechanism for the virus to develop resistance to PIs. Occasionally, secondary mutations in the cleavage sites are also seen in patients who have failed PI-containing regimens. Crystallographic studies coupled with molecular dynamics simulations on wild-type and coevolved substrate complexes have revealed the structural rationale for why certain cleavage sites are more susceptible to resistance than others and how the cleavage site mutations compensate for the substrate processing efficiency lost upon protease mutations. The substrate-envelope hypothesis allowed quantitative assessment of the fit of each substrate within the substrate envelope. These studies, first, showed that some substrates are less in consensus with the majority of the substrates in terms of the shape adopted within the binding site, including NC-p1 and p1-p6 cleavage sites (Ozen et al. 2011). These substrates, along with inhibitors, interact favorably with a small subset of resistance mutation sites in the protease, e.g., D30, I50, and V82 (King et al. 2004a; Ozen et al. 2011). Strikingly, the outlier substrates, NC-p1 and p1-p6, correspond to the cleavage sites at which mutations were observed in patients who failed PI-containing regimens (Kolli et al. 2009, 2006). The substrate-envelope hypothesis, based on structural evidence, suggests that these substrates protrude beyond the substrate envelope and contact the sites of drug resistance mutations in the protease, leading to impaired substrate recognition and cleavage. This results in coevolution of compensatory mutations within the protease cleavage sites but often at other positions within the cleavage site (King et al. 2004a).

Emergence of D30N/N88D mutations in the protease in a correlated manner with the L449F Gag mutation on the p1-p6 cleavage site is a good example demonstrating that the protease-substrate coevolution validates the substrate envelope as the recognition motif for HIV-1 protease. D30N, a nelfinavir-signature protease mutation, is selected with high frequency in nelfinavir-treated HIV-infected individuals. From co-crystal structures, nelfinavir is known to pick up critical interactions with D30 at one monomer of the protease, which makes nelfinavir hypersusceptible to D30N mutation. Residue 88, on the contrary, does not directly interact with nelfinavir, but the N88D mutation is thought to maintain the overall local charge in the D30N background because 88 is in close proximity of 30 (Kolli et al. 2006).

However, the resistance mechanism through the L449F Gag mutation in the p1-p6 cleavage site is not obvious from the nelfinavir-bound crystal structures. A complete understanding of resistance against nelfinavir requires special attention to the substrate specificity and the mechanisms by which substrate specificity is maintained by the drug-resistant virus. Evidently, p1-p6 interacts with D30 on the other monomer outside the substrate envelope (i.e., p1-p6 interacts with D30 on the other monomer more than the majority of the substrates do). Therefore, the D30N mutation interferes with both nelfinavir binding and p1-p6 processing, with likely minimal detrimental effects on the recognition of other substrates. Substituting the wild-type leucine with the bulkier phenylalanine at Gag 449, corresponding to the S1′ pocket of the binding site, compensates for the loss of interactions with the D30N/N88D protease by filling the substrate envelope much more efficiently (Ozen et al. 2012). As a result, drug therapy selects for the cleavage site mutations that are able to restore the loss of fit within the substrate envelope and bring the cleavage site more in consensus with the majority of the substrates. In conclusion, coevolved mutations within the cleavage sites play a key role in the development of resistance and affect the virological response during therapy. The substrate-envelope hypothesis, in addition to the specificity of the substrates, explains the development of resistance to various PIs and substrate coevolution.

Substrate-Envelope-Guided Drug Design

The substrate envelope can guide the development of robust PIs that retain potency against severely resistant HIV-1 protease variants. Based on the substrate-envelope hypothesis, the optimum strategy to minimize resistance is to design inhibitors that fit within the substrate envelope (Fig. 3). In retrospect, where the five drugs in clinical use specifically protrude outside the substrate envelope correlates with the loss of affinity to drug-resistant proteases (Chellappan et al. 2007a). Meanwhile, DRV, the most potent of the currently prescribed inhibitors, fits well within the substrate envelope although not designed using the substrate envelope as a constraint (King et al. 2004a; Lefebvre and Schiffer 2008). Retrospective correlation of the substrate envelope with resistance mutations promoted the design of new inhibitors with substrate-envelope constraints (Nalam and Schiffer 2008; Altman et al. 2008; Nalam et al. 2010; Ali et al. 2006; Chellappan et al. 2007b).

Fig. 3

Substrate-envelope-based drug design. (a) Most severe resistance mutations (red) occur at sites contacted by competitive inhibitors (orange) outside the substrate envelope (blue). (b) The dynamic substrate envelope can be defined as a probability distribution of the consensus substrate shape within the binding site by combining molecular dynamics simulations and three-dimensional grid-based volume calculations. (c) The dynamic substrate envelope can be integrated into structure-based design of robust drugs by systematically optimizing two metrics: (1) Vout, the probabilistic volume of an inhibitor falling outside the dynamic substrate envelope, and (2) Vremaining, the portion of the dynamic substrate envelope that is not fully occupied and, therefore, can be better utilized by an inhibitor

To validate the substrate-envelope hypothesis, various groups designed new HIV-1 PIs on the hydroxyethylamine scaffold taking different approaches. Two computational methods incorporated the substrate envelope as an a priori constraint during the design stage of the inhibitors, while the third method employed a structure-activity relationship (SAR) that does not include the substrate-envelope constraint explicitly. The first computational design, based on optimized docking, resulted in two good candidates exhibiting flat affinity profiles against multidrug-resistant mutants, although the binding affinity of these candidates was in the nM range (Chellappan et al. 2007b). The second computational design systematically explored the combinatorial space for three constituent R groups on the same scaffold in two rounds of computational design, chemical synthesis, biochemical testing, and crystallographic analysis. The second round resulted in low nM–pM range compounds, the majority of which have flatter resistance profiles against a wide range of drug-resistant viral variants (Altman et al. 2008). As a negative control, the SAR approach yielded pM inhibitors; however, they were significantly less potent against the resistant variants (Ali et al. 2006). These studies successfully validated the substrate-envelope constraint as a robust design strategy for HIV-1 PIs with reduced susceptibility to resistance and yielded several leads for potential new drugs (Nalam et al. 2010).

When the designed inhibitors effectively mimic the wild-type substrate-binding features, a larger number of mutations in the protease and cleavage sites will be needed to alter the balance between the substrate recognition and inhibitor binding in favor of substrate recognition to achieve drug resistance. Using the substrate envelope as a constraint not only improves the efficacy of the new inhibitors against the known resistant variants of HIV-1 protease but also likely minimizes the chances of potential compensatory mutations in the cleavage sites.

Generality of the Substrate-Envelope Hypothesis

Crystal structures typically capture a static image of the native state. To test the general applicability of the crystallography-based substrate envelope, the effect of substrate dynamics in the bound state was assessed by molecular dynamics simulations. In addition, drug targets other than HIV-1 protease were assessed in retrospect for the correlation of the substrate envelope with the mutational sensitivity.

Effect of Protein Dynamics on the Substrate Envelope: Dynamic Substrate Envelope

Protein dynamics is often neglected in drug design. Conformational ensembles of the native state are not readily accessible experimentally at the atomistic level in a high-throughput manner. Computational methods can help estimate conformational dynamics at the expense of computational time. While the force fields describing the molecular interactions can still benefit from improvements, advances in parallel computing architectures and algorithms have transformed the molecular dynamics field and increased the ability to simulate longer timescales and larger systems. The earlier simulations of an ~900 atom protein lasted 9 ps (McCammon et al. 1977), while protein-folding simulations as long as 1 ms (Lindorff-Larsen et al. 2011) or a 50 ns simulation of an intact virion of one million atoms (Freddolino et al. 2006) can now be performed.

Taking advantage of these advancements, the substrate-envelope model was recently extended by considering the role of protein dynamics in the interactions of HIV-1 protease with its substrates. The dynamic substrate envelope, which was defined based on thousands of substrate conformers from molecular dynamics simulations, proved to be a more accurate representation of protease-substrate interactions and better defined the substrate specificity for HIV-1 protease (Ozen et al. 2011). The dynamic substrate envelope, being a more realistic model, reproduced the essentials of the static substrate envelope, which was based on the crystal structures, confirming that the substrate envelope is a realistic hypothesis rather than a crystallographic artifact (Fig. 3b).
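As a rough illustration of how a probabilistic envelope and the two design metrics named in the Fig. 3 caption might be computed, consider the sketch below. It assumes that every simulation snapshot of every substrate has already been mapped to a set of occupied voxels on a shared grid. The exact definitions used in the published work are not given in this text, so the formulas here are one plausible reading: the volume outside the envelope is weighted by one minus the substrate occupancy probability, and the remaining volume sums the occupancy probability over voxels the inhibitor leaves empty. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Voxel = Tuple[int, int, int]  # integer grid coordinates of an occupied cell


def dynamic_envelope(conformers: Iterable[Set[Voxel]]) -> Dict[Voxel, float]:
    """Fraction of substrate conformers that occupy each voxel."""
    frames = list(conformers)
    if not frames:
        raise ValueError("at least one conformer is required")
    counts = Counter(voxel for frame in frames for voxel in frame)
    return {voxel: n / len(frames) for voxel, n in counts.items()}


def v_out(
    inhibitor: Set[Voxel], envelope: Dict[Voxel, float], voxel_volume: float = 1.0
) -> float:
    """Assumed metric: inhibitor volume weighted by (1 - occupancy probability).

    Voxels never visited by any substrate conformer count in full.
    """
    return voxel_volume * sum(1.0 - envelope.get(v, 0.0) for v in inhibitor)


def v_remaining(
    inhibitor: Set[Voxel], envelope: Dict[Voxel, float], voxel_volume: float = 1.0
) -> float:
    """Assumed metric: probability-weighted envelope volume left unoccupied."""
    return voxel_volume * sum(p for v, p in envelope.items() if v not in inhibitor)
```

Under these assumptions, a design loop would aim to lower both values at once: a small `v_out` means little of the inhibitor lies where substrates rarely go, and a small `v_remaining` means the inhibitor uses most of the space that substrates commonly occupy.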

In addition, characterization of structural dynamics of a series of substrates provided insights into the interdependent nature of substrate recognition, which was not immediately evident in crystal structures. HIV-1 protease substrates all need to be recognized and processed by the same enzyme; however, the polyprotein processing is tightly regulated and premature/imprecise processing leads to noninfectious virions. Molecular dynamics studies showed that the substrates all possess common properties that allow the recognition by the protease, but also subtle differences in the interactions with the protease result in preferential recognition. The balance between the shape commonality (i.e., consensus volume) and sequence diversity of the substrates is maintained by interdependence within individual substrates in terms of conformational and sequence preferences. In conclusion, the interplay between the conserved and varied properties of the cleavage sites enables the preferential substrate recognition and regulation of substrate processing.

Application of the Substrate-Envelope Hypothesis to Other Drug Targets

Applicability of the substrate-envelope hypothesis has been tested for five prospective drug targets from a diverse set of diseases: Abl kinase, chitinase, thymidylate synthase, dihydrofolate reductase, and neuraminidase (Kairys et al. 2009). The volume of inhibitors protruding beyond the native substrate envelope trended with average mutational sensitivity, suggesting that inhibitor design would benefit from a similar reverse engineering strategy for these enzymes. Similarly, the two reverse transcriptase inhibitors, AZT and 3TC, have elements protruding beyond the native substrate envelope formed by deoxyribonucleotides. These elements create an opportunity for the reverse transcriptase to develop resistant mutations at the deoxyribonucleotide binding site. However, tenofovir, a reverse transcriptase inhibitor designed with the substrate-envelope constraints, lacks such protrusions and is relatively effective against AZT-resistant HIV variants (Tuske et al. 2004). Finally, the substrate envelope rationalized drug resistance against hepatitis C viral serine protease NS3/4A inhibitors (Romano et al. 2010). NS3/4A is described below as an emerging candidate to target with the substrate-envelope approach.

Substrate Envelope of Hepatitis C Viral Serine Protease NS3/4A

Hepatitis C is a liver disease with significant global impact and, like HIV/AIDS, is caused by an RNA virus, in this case a member of the Flaviviridae family. Hepatitis C virus (HCV) infection can lead to liver cirrhosis and hepatocellular carcinoma and is the most common reason for liver transplants in the United States (US). The World Health Organization estimates that 150 million people worldwide are infected with HCV, with 3–4 million new infections occurring every year and more than 350,000 deaths from HCV-related liver diseases (Lesage et al. 2009). Similar to HIV, HCV is genetically highly diverse. So far, six major HCV genotypes and several subtypes within each genotype have been identified (Simmonds et al. 2005). A high viral replication rate combined with the error-prone RNA-dependent RNA polymerase causes large inter-patient genetic diversity as well as viral diversity within a single infected individual (Bukh et al. 1995a, b). Since the discovery of HCV in 1989 as the cause of hepatitis C (Choo et al. 1989; Kuo et al. 1989), the genetic heterogeneity of the virus across and within patients has greatly challenged the development of robust direct-acting antiviral agents (DAAs) that retain efficacy against multiple genotypes and drug-resistant variants of these genotypes. Until 2011, the standard of care for HCV was weekly injections of pegylated interferon α combined with ribavirin (Peg-IFN/RBV), which can result in undetectable levels of HCV in 70–80% of people with genotypes 2 and 3 but only 40–50% of people with genotype 1 (Lesage et al. 2009). Genotype 1, the most difficult genotype to treat, is also the most common form of HCV in the US, accounting for about 75–80% of the cases (Alter et al. 1999; Blatt et al. 2000). In 2011, two DAAs, telaprevir and boceprevir, were approved by the FDA for clinical use in combination with Peg-IFN/RBV for the treatment of genotype 1 patients. In addition to the problem of drug resistance, the severe side effect profile of this combination therapy amplifies the need to develop widely effective and better-tolerated DAAs.

Among the drug targets against HCV is the nonstructural protein 3 (NS3), a 631-amino acid bifunctional protein with a serine protease domain located in the N-terminal one-third and an NTPase/RNA helicase domain in the C-terminal two-thirds (Fig. 4a). Why the protease and helicase domains are physically linked is not fully understood. Although their interplay has been reported (Beran and Pyle 2008; Beran et al. 2007, 2009; Frick et al. 2004), both domains fold independently and are active in the absence of the other (Beran and Pyle 2008; Frick et al. 2004; Beran et al. 2007; Lam et al. 2003; Gallinari et al. 1998). NS3/4A protease adopts a chymotrypsin-like fold with two β-barrel domains. The catalytic triad is formed by His-57, Asp-81, and Ser-139 and is located in a cleft separating the two domains. The structure is stabilized by a Zn2+ ion that is coordinated by Cys-97, Cys-99, Cys-145, and His-149. Efficient proteolytic activity of NS3 requires the cofactor NS4A, a 54-amino acid peptide that is tightly associated with the protease (Lesage et al. 2009). NS4A aids in the proper folding of NS3; the central 11 amino acids of NS4A insert as a β-strand into the N-terminal β-barrel of NS3. The HCV genome encodes a single polyprotein of ~3,000 amino acids, which is processed by a series of host and viral proteases into at least 10 structural and nonstructural proteins. The viral NS3/4A hydrolyzes the polyprotein precursor at four cleavage sites (3-4A, 4A-4B, 4B-5A, 5A-5B), yielding nonstructural proteins essential for viral maturation. The first proteolytic event occurs at the 3-4A junction in cis as a unimolecular reaction, while processing of the remaining junctions 4A-4B, 4B-5A, and 5A-5B occurs bimolecularly in trans (Bartenschlager et al. 1994). As with HIV protease, the cleavage sites of NS3/4A protease are nonhomologous except for an Asp/Glu at P6, Cys/Thr at P1, and Ser/Ala at P1′ (Fig. 4b).
NS3/4A also confounds the innate immune response to viral infection by cleaving the human cellular targets TRIF and MAVS, thereby blocking Toll-like receptor 3 signaling and RIG-I signaling, respectively (Chen et al. 2007; Heim 2013; Li et al. 2005a, b). Cleavage of another cellular target, TC-PTP, at two separate sites enhances EGF signaling and basal Akt activity (Brenndorfer et al. 2009). Very recently, DDB1, a core subunit of the Cul4-based ubiquitin ligase complex, was reported to play a critical role in HCV replication and to be cleaved by NS3/4A (Kang et al. 2013). Thus, in addition to blocking viral maturation, effective inhibition of the proteolytic activity of NS3/4A may also exert indirect antiviral effects, further interfering with viral replication.
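
The conserved positions noted above for the viral cleavage sites (Asp/Glu at P6, Cys/Thr at P1, Ser/Ala at P1′) can be written as a simple sequence pattern. The following Python sketch is only an illustration: the function name, the decision to leave P5 to P2 unconstrained, and the toy sequence are assumptions made here, and such a loose pattern would be expected to return many false positives on a real polyprotein.

```python
import re

# Conserved positions taken from the text: Asp/Glu at P6, Cys/Thr at P1,
# Ser/Ala at P1'. P5-P2 are left unconstrained (an assumption of this sketch).
# The lookahead lets overlapping candidates be reported.
CLEAVAGE_MOTIF = re.compile(r"(?=([DE].{4}[CT])([SA]))")


def find_candidate_sites(sequence):
    """Return (index_of_P1_prime, P6_to_P1, P1_prime) for each pattern match.

    index_of_P1_prime is 0-based; the scissile bond lies just before it.
    """
    hits = []
    for match in CLEAVAGE_MOTIF.finditer(sequence):
        # match.start() is the position of P6; P1' is six residues later.
        hits.append((match.start() + 6, match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    # Made-up toy sequence for illustration; NOT an HCV sequence.
    toy = "GGDAAAACSGGEKKKKTAGG"
    for position, nonprime, prime in find_candidate_sites(toy):
        print(position, nonprime, prime)
```

A pattern of this kind only encodes sequence conservation; it says nothing about the shape the bound substrates adopt, which is what the substrate envelope captures.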

Fig. 4

(a) Hepatitis C viral NS3 is a bifunctional protein with helicase and protease domains, shown in ribbon. Helicase and protease are colored in light pink and teal, respectively. A cleaved substrate product is shown in magenta. (b) NS3/4A protease recognizes four sites on the ~3,000-amino acid viral polyprotein and a series of host cellular proteins. (c) Major drug resistance mutations and resistance-associated mutations are labeled red and dark blue, respectively. The catalytic triad is colored yellow. (d) The NS3/4A protease inhibitors telaprevir and boceprevir were approved in May 2011. Three of the macrocyclic NS3/4A inhibitors in development are shown as examples.

The very shallow binding site of HCV NS3/4A protease has made it challenging to develop high-affinity, low-molecular-weight inhibitors, because engineering small molecules to form tight interactions with a shallow surface is not straightforward. However, product inhibition by the N-termini of the trans-cleavage sites formed the basis for the development and optimization of peptidomimetic inhibitors of the NS3/4A protease (Steinkuhler et al. 1998; Llinas-Brunet et al. 1998; De Francesco and Migliaccio 2005). The proof of concept for antiviral efficacy was first demonstrated in 2002 with the macrocyclic inhibitor BILN-2061 (ciluprevir), which was later discontinued due to concerns about its cardiotoxicity (Lamarre et al. 2003; Hinrichsen et al. 2004; Vanwolleghem et al. 2007).

Telaprevir and boceprevir, the FDA-approved NS3/4A inhibitors, were developed by Vertex and Schering-Plough, respectively (Fig. 4d). Both telaprevir (Perni et al. 2006; Kwong et al. 2011) and boceprevir (Malcolm et al. 2006) are acyclic ketoamide inhibitors that associate with the protease through a reversible covalent bond with the catalytic serine (S139) as well as short-range molecular interactions with the binding site. In addition, several non-covalent inhibitors, including macrocyclic compounds, are currently at various stages of clinical development. The non-covalent acylsulfonamide inhibitors contain a macrocycle connecting either the P1 and P3 groups (ITMN-191, or danoprevir (Seiwert et al. 2011)) or the P2 and P4 groups (MK-5172 (Harper et al. 2012) and MK-7009, or vaniprevir (Liverton et al. 2010)), reducing the entropic cost associated with binding the shallow surface of the protease. In addition to the resistance reported in replicon studies, HCV quickly evolves resistance to these protease inhibitors even at early stages of clinical trials, compromising their high efficacy (He et al. 2008; Kieffer et al. 2007; Lin et al. 2005; Sarrazin et al. 2007; Tong et al. 2008, 2006). Despite their subnanomolar potency, the macrocyclic inhibitors also select for drug resistance mutations in the clinic. Most PIs, in the clinic or in development, are susceptible to a common set of protease mutations, which raises the issue of cross-resistance. However, the level of susceptibility to different mutations varies with the drug. For example, R155K, A156T, and D168A are three mutations that are observed in patients treated with both linear and macrocyclic inhibitors, but the macrocyclic inhibitors appear to be more susceptible to R155K than the linear compounds (Fig. 4c) (Romano et al. 2012).

The limitation of the current drugs to a single genotype and their susceptibility to quickly emerging resistance mutations drive research toward inhibitors with broader activity. The substrate-envelope hypothesis has aided in elucidating the mechanism by which the protease mutations confer resistance to the current inhibitors.

High-resolution co-crystal structures have been determined for the wild-type NS3/4A protease domain with the cleavage products as well as inhibitors, including telaprevir, boceprevir, simeprevir, danoprevir, MK-5172, and vaniprevir (Romano et al. 2010, 2012). In these structures, the products, despite the low sequence homology, adopted a consensus volume at P6 to P1 residues, the substrate envelope. Similar to HIV, the most severe resistance mutations occur at protease residues that are contacted by the inhibitors outside the substrate envelope.
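
The idea of a consensus volume, and of inhibitor atoms protruding beyond it, can be sketched numerically. The Python example below is a minimal illustration and not the procedure used in the cited structural studies: it assumes atoms are given as spheres (x, y, z, radius) in a common reference frame after superposition, places them on a 1 Å grid, and treats a voxel as part of the envelope when at least `min_count` substrates occupy it. The function names, the grid spacing, and the threshold are all assumptions of this sketch.

```python
from itertools import product


def occupied_voxels(atoms, spacing=1.0):
    """Set of integer grid indices whose voxel centers fall inside any atom sphere.

    atoms: iterable of (x, y, z, radius) tuples, in angstroms.
    """
    voxels = set()
    for x, y, z, r in atoms:
        lo = [int((c - r) // spacing) for c in (x, y, z)]
        hi = [int((c + r) // spacing) + 1 for c in (x, y, z)]
        for i, j, k in product(*(range(a, b + 1) for a, b in zip(lo, hi))):
            dx, dy, dz = i * spacing - x, j * spacing - y, k * spacing - z
            if dx * dx + dy * dy + dz * dz <= r * r:
                voxels.add((i, j, k))
    return voxels


def consensus_envelope(substrates, min_count, spacing=1.0):
    """Voxels occupied by at least min_count of the superposed substrates."""
    counts = {}
    for atoms in substrates:
        for voxel in occupied_voxels(atoms, spacing):
            counts[voxel] = counts.get(voxel, 0) + 1
    return {voxel for voxel, n in counts.items() if n >= min_count}


def fraction_outside(inhibitor, envelope, spacing=1.0):
    """Fraction of the inhibitor's voxels that lie outside the envelope."""
    voxels = occupied_voxels(inhibitor, spacing)
    if not voxels:
        return 0.0
    return len(voxels - envelope) / len(voxels)


if __name__ == "__main__":
    # Toy single-sphere "molecules"; hypothetical coordinates, not real structures.
    substrate = [(0.0, 0.0, 0.0, 1.5)]
    outlier = [(10.0, 0.0, 0.0, 1.5)]
    envelope = consensus_envelope([substrate, substrate, outlier], min_count=2)
    print(fraction_outside(substrate + outlier, envelope))
```

In this picture, an inhibitor with a large fraction of its volume outside the envelope contacts protease residues that the substrates do not need, which is where resistance mutations are expected to be best tolerated by the virus.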

Crystal structures of the resistant protease variants bound to telaprevir and three macrocyclic inhibitors in development, danoprevir, vaniprevir, and MK-5172, revealed the structural basis of the three major active site resistance mutations, R155K, A156T, and D168A (Romano et al. 2010, 2012). Outside the substrate envelope, protease residue 155 is contacted much more favorably by the carbamate-linked bulky isoindoline groups of vaniprevir and danoprevir than by telaprevir, boceprevir, or MK-5172. Therefore, a mutation at this residue renders the isoindoline-containing compounds less effective, while MK-5172 retains reasonable affinity against R155K protease because its ether-linked quinoxaline group packs against the conserved catalytic His-57.

However, the fold change in affinity against R155K protease varies with the inhibitor. The wild-type Arg-155 participates in an electrostatic network of hydrogen bonds along the binding surface. This network involves residues His-57, Arg-155, Asp-168, and Arg-123. Substituting the arginine at 155, which can make two hydrogen bonds, with a lysine, which can make only one hydrogen bond, disrupts this electrostatic network and compromises the stability of the binding surface. Although both danoprevir and vaniprevir have favorable interactions with R155, the mutation has a more detrimental effect on the binding affinity of vaniprevir than on that of danoprevir because vaniprevir has a linker connecting the bulky P2 isoindoline to P4, whereas danoprevir lacks this linker. Molecular dynamics simulations, consistent with the crystallographic temperature factors and the inhibitor conformations in multiple molecules in the asymmetric unit, suggest that the lack of this linker renders the P2 group locally flexible without altering the binding mode of the inhibitor core. This local flexibility likely tolerates the destabilization of the binding surface due to R155K, while vaniprevir, constrained by its P2-P4 macrocycle, cannot escape the destabilizing effects of R155K. As a result, the loss of affinity against the R155K protease differs by 4.5-fold between danoprevir and vaniprevir (Ozen et al. 2013).
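
The fold changes discussed above can be related to binding free energies with a short calculation. The Python sketch below assumes that inhibition constants approximate dissociation constants, so that a fold change in affinity corresponds to ΔΔG = RT ln(fold change). The function names are illustrative, and the numbers in the usage example are placeholders, not measured values from the cited work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change(ki_mutant, ki_wild_type):
    """Loss of affinity of one inhibitor: Ki(mutant) / Ki(wild type)."""
    return ki_mutant / ki_wild_type


def relative_fold(fold_a, fold_b):
    """How many times larger inhibitor A's loss of affinity is than inhibitor B's."""
    return fold_a / fold_b


def ddg_kcal_per_mol(fold, temperature_k=298.15):
    """Free energy difference corresponding to a fold change, assuming Ki ~ Kd."""
    return R_KCAL * temperature_k * math.log(fold)


if __name__ == "__main__":
    # Placeholder Ki values in arbitrary but consistent units (hypothetical).
    fold_a = fold_change(ki_mutant=9.0, ki_wild_type=1.0)
    fold_b = fold_change(ki_mutant=2.0, ki_wild_type=1.0)
    ratio = relative_fold(fold_a, fold_b)
    print(ratio, round(ddg_kcal_per_mol(ratio), 2))
```

Under these assumptions, a 4.5-fold difference between two inhibitors corresponds to roughly 0.9 kcal/mol at 25 °C, a small energetic difference relative to the total binding free energy.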

Although the flat binding surface of NS3/4A is difficult to target, the comparative analysis of substrates and chemically diverse small-molecule inhibitors suggests that a substrate-envelope-based design approach has the potential to yield more robust novel inhibitors. In taking this approach, considering conformational dynamics is probably even more critical than for HIV protease because even the bound compounds have unique flexibilities, which have critical implications for drug resistance.

Conclusions and Future Perspective

Drug resistance will occur whenever rapid growth and evolution exist under the selective pressure of drug treatment but growth is not completely inhibited by the drug. This widespread problem extends from invasive cancers to pathogenic microbes such as bacteria, malaria, fungi, tuberculosis, and viruses. The mechanisms by which resistance can emerge include point mutations in the target protein. To overcome drug resistance, resistance mutations should be predicted before they arise, and drugs should be designed to avoid them. To achieve this goal, target identification is critical. Enzymes with multiple substrates, which cannot easily tolerate mutations and still maintain function, are potentially good candidates.

Crystallography is extremely informative, providing insights into the most probable molecular interactions in the native state. However, proteins are dynamic and exist in conformational ensembles even in the native state. Depending on the inherent structural and dynamic properties of the drug target, ignoring protein dynamics may delay the successful discovery of novel drugs that have high potency, good selectivity, and low toxicity and are also robust against the evolution of resistance. To develop these robust drugs, experimental techniques and computational methods should be used in concert, each according to its particular strengths. The dynamic substrate envelope is a useful tool to systematically incorporate protein dynamics and evolution into structure-based rational drug design. Substrate-envelope-guided drug design necessitates constant partnering of multiple disciplines such as chemical synthesis, thermodynamics and enzyme kinetics, crystallography, NMR, molecular modeling and dynamics simulations, deep sequencing, and virology.

The current understanding of the structure and dynamics of substrate recognition and drug resistance in HIV and HCV proteases will serve as a useful guide for the rational design of future-generation drugs that remain active against diverse populations of drug targets. To combat quickly evolving diseases, all drug targets should be viewed as evolutionarily dynamic, and inhibitors should be designed to be as evolutionarily constrained as possible. The target is moving, and robust drug design requires hitting multiple targets at a time.