Introduction

Cysteine proteases are one of the five major classes of proteolytic enzymes. Along with aspartic, serine and threonine proteases and metalloproteinases, they constitute the biocatalysts that hydrolyze peptide bonds in various proteins [1]. The hallmark of cysteine proteases is the presence of a cysteine residue in the enzyme’s active site. The nucleophilic thiol group of the catalytic cysteine forms a covalent bond with the carbonyl group of the scissile peptide bond in substrates [2].

Cysteine proteases comprise 82 families of enzymes classified into 14 clans. The clan CD includes the caspase family of cytosolic aspartate-directed endopeptidases involved in apoptosis and inflammation. The adenoviral endopeptidase and related proteins belong to the clan CE. Clans CF, CO and CP represent proteases with distinct tertiary scaffolds. Bacterial peptidases that hydrolyze and transfer bacterial cell wall peptides are included in the clan CL. Clans CM, CN and CQ contain viral polyprotein endopeptidases. Mixed nucleophile peptidases with varied types of activity are assigned to clans PA, PB and PC, and some self-processing proteins to the clan PD [3]. However, the most abundant cysteine proteases share a common structural fold with papain, a plant protease isolated from Carica papaya fruits. Therefore, they are named papain-like cysteine proteases and grouped into the clan CA. This clan comprises 35 families, the most numerous being the C1 papain family [2–4]. Papain-like cysteine proteases are mainly endopeptidases, yet some of them possess additional or exclusive exopeptidase activity. They are widespread in nature, being found in viruses and almost every group of living organisms, including bacteria, fungi, protists, plants, invertebrates and vertebrates [5].

Plant cysteine proteases are localized to different types of vacuoles, the cell wall, chloroplasts, the “bodies” of the endoplasmic reticulum or the cytosol. They are involved in multiple processes of cellular regulation in plants, such as programmed cell death, processing of storage proteins before their deposition in developing seeds, mobilization of storage proteins during seed germination and seedling growth, organ senescence, tracheary element differentiation and intracellular protein turnover in response to abiotic or biotic stress [6]. The aforementioned papain, a component of the latex of the papaya tree, is accumulated and activated upon mechanical wounding of the papaya fruit and thus can serve as a part of the plant’s proteolytic defense system against external pathogens [7]. It is noteworthy that papain was the second enzyme and the first cysteine protease to be crystallized and to have its structure determined [8].

Animal cysteine proteases of the papain family, named cathepsins (Gr. καθεψειν [kathepsein], “to digest”), are active in a slightly acidic environment within the lysosome [9]. Eleven human cysteine cathepsins have been identified so far. Some of them are ubiquitously produced in all tissues (cathepsins B, C, F, H, L, O and X), while others are confined to a specific cell or tissue type, where they play more particular roles (cathepsins K, S, V and W) [10]. The distribution of cathepsins is not limited to lysosomal vesicles; these enzymes also occur, to a lesser extent, in other cellular compartments (e.g., nucleus, cytosol, plasma membrane) or are secreted into the extracellular milieu [11–13]. Besides playing a pivotal role in intracellular protein turnover, cysteine cathepsins are involved in a variety of physiological processes (e.g., proenzyme activation, prohormone maturation, phagocytosis, major histocompatibility complex class II (MHC-II)-mediated antigen presentation, bone remodeling, cell cycle progression, apoptosis) [11, 14, 15]. However, alterations in the synthesis, activity and localization of cathepsins may contribute to the development of different pathologies (e.g., rheumatoid arthritis, osteoarthritis, osteoporosis, atherosclerosis, muscular dystrophy, tumor progression and metastasis) [2, 14]. Therefore, cysteine cathepsins are considered promising drug targets for the effective treatment of many diseases [16].

Cysteine proteases produced by microorganisms not only contribute to general protein turnover and nutrient processing, but often constitute important virulence factors during host invasion [17]. Upon infection, many bacteria secrete these enzymes in order to degrade the main components of the extracellular matrix (e.g., collagen, elastin, fibronectin) and, thus, to infiltrate host tissues [18–20]. Bacterial cysteine proteases also facilitate pathogen adhesion to other microbial and host epithelial cells, leading to biofilm formation and effective host colonization [20]. Several phyto- and zoopathogenic bacteria (e.g., Pseudomonas syringae, Yersinia spp.) deliver papain-like effector proteases into host cells via the type III secretion system to modulate host immune responses, such as the hypersensitive response in plants or macrophage-mediated phagocytosis in animals [21, 22]. Parasitic protists accumulate cysteine proteases in various cellular compartments, like vacuoles, endosomes, lysosomes, cytoplasmic vesicles and granules, the cell membrane and internal membranes, and the Golgi apparatus. The vast majority of these proteases exhibit a cathepsin-like structure and exert numerous pathological effects on the host, including hydrolysis of the extracellular matrix, hemoglobin, transferrin and immunoglobulins, release of proinflammatory cytokines and kinins, activation or inactivation of the complement system, degradation of the mucus layer, destruction of the colonic epithelium with its accompanying tight junctions, macrophage infection and traversal of the blood–brain barrier [23–26]. Some immunogenic microbial cysteine proteases have been suggested for implementation as diagnostic markers for parasitic diseases or conserved antigens in vaccine formulation [17]. Many of them have become drug targets for disease treatment (e.g., gingipain of Porphyromonas gingivalis, falcipain of Plasmodium falciparum and cruzipain of Trypanosoma cruzi for the treatment of periodontitis, malaria and Chagas’ disease, respectively) [20, 25, 26].

The activity of cysteine proteases is often controlled within a cell by their endogenous inhibitors in order to maintain physiological levels of proteolysis. However, such a protective action is not the only attribute of the inhibitors; the differences in their structure, specificity, affinity and distribution indicate much more complex roles of these molecules, which have also been proven to interact with exogenous peptidases produced by other species [27]. Microbial inhibitors of cysteine proteases, along with their target endogenous enzymes, may directly affect the host’s defense mechanisms and promote infection [28]. The present review first describes the structures and functions of cysteine protease inhibitors, including microorganism-derived inhibitors. It then gives an overview of cysteine protease inhibitors and their microbial producers. Finally, it provides insight into the applications of microbial inhibitors in science, medicine and biotechnology.

Diversity of cysteine protease inhibitors

According to the MEROPS database [3], there are 20 families of the proteinaceous inhibitors of cysteine proteases, which belong to different clans and have representatives in viruses, microorganisms, plants and animals. These molecules are predominantly tight-binding and reversible inhibitors [29].

The family I25 of the clan IH comprises the inhibitors named cystatins. They act primarily on cysteine proteases of the papain family. Based on their size, sequence homology, post-translational modifications and distribution, cystatins have been subdivided into three types: stefins, cystatins and kininogens. Stefins (type 1 cystatins of the subfamily I25A) are mainly intracellular, single-chain polypeptides of about 100 amino acid residues (molecular mass about 11 kDa), lacking a signal peptide, disulfide bridges and carbohydrate groups. Cystatins (type 2 cystatins of the subfamily I25B) are also single-chain polypeptides of about 120 amino acid residues (molecular mass about 13–14 kDa) with two conserved disulfide bridges, generally non-glycosylated and synthesized with a signal peptide for extracellular trafficking, thus found in most body fluids. Kininogens (type 3 cystatins of the subfamily I25C) are large blood plasma glycoproteins (molecular mass about 60–120 kDa) with three tandemly repeated type 2-like cystatin domains, two of which inhibit papain-like enzymes [6, 27, 29]. The structural analysis of several crystallized cystatin–protease complexes revealed the exact mechanism of cystatin-induced inhibition; it involves the insertion of the cystatin’s wedge-shaped edge, consisting of two hairpin loops and the amino terminus, into the enzyme’s active site cleft, which then becomes inaccessible to substrates [30, 31]. The primary function of cystatins is to regulate the activity of endogenous cysteine proteases. In animals, the inhibitors contribute to many processes not always related to their inhibitory properties, such as cell proliferation and differentiation, inflammation, immune response, angiogenesis, and either suppression or promotion of tumorigenesis and metastasis [27, 32, 33]. Plant cystatins, named phytocystatins, may confer resistance to pathogen attack and support plant defense against stress agents, such as drought, salinity, oxidation, cold and heat shock [34].

Thyropins (family I31, clan IX) contain at least one cysteine-rich domain, which shares no sequence homology with cystatins, but is significantly homologous to the conserved thyroglobulin type-1 domain [27, 29]. Thyropins comprise functionally unrelated proteins, including: insulin-like growth factor-binding proteins (IGFBPs), MHC-II-associated p41 fragment of the invariant chain (p41Ii), equistatin, chum salmon egg cysteine protease inhibitor (ECI) and saxiphilin. IGFBPs, p41Ii and ECI contain only one thyroglobulin type-1 domain, whereas saxiphilin and equistatin contain two and three such domains, respectively [27]. Thyropins, similar to cystatins, inhibit mostly cysteine proteases of the papain family. However, a few of them also influence the activity of aspartic proteases and metalloproteinases [35]. For instance, the first thyroglobulin type-1 domain of equistatin inhibits papain-like peptidases, while the second domain may simultaneously inhibit aspartic cathepsin D. In p41Ii, the single thyroglobulin type-1 domain possesses high affinity and selectivity toward antigen-processing cysteine cathepsin L. Therefore, p41Ii has a putative role in the control of antigen presentation by preventing the excessive cleavage of antigenic epitopes [27].

Propeptides (family I29, clan JF) are the integral, N-terminal parts of the precursors of papain-like cysteine proteases. They assist in the proper folding of the proenzymes, protect them from premature activation and direct them to the lysosomal compartment [36]. Upon zymogen maturation, its propeptide region is removed, releasing an active form of the protease [37]. Synthetic propeptides have been shown to selectively inhibit their cognate enzymes in vitro with high efficacy [38]. The C-terminal portion of a propeptide competitively binds to the enzyme’s active site cleft in an orientation opposite to that of a substrate [39].

The proteinaceous inhibitors of cysteine proteases vary greatly in terms of their specificity. The aforementioned cystatins, thyropins and propeptides interact mostly with peptidases of the family C1 (clan CA), but members of the other inhibitor families may selectively affect the activity of distinct non-papain-family enzymes. For instance, human calpastatins (family I27, clan II) inhibit calcium-dependent calpains (family C2, clan CA) [40], and the viral protein p35 (family I50, clan IQ) inhibits proapoptotic caspases (family C14, clan CD) [41]. Additionally, several inhibitors of serine proteases (e.g., serpins and soybean Kunitz trypsin inhibitor) can also inactivate cysteine proteases in cross-class inhibition [42, 43]. Notably, many other proteins that show potency against cysteine proteases are not yet assigned to any inhibitor family, e.g., β-lactoglobulin from bovine milk whey [44] or a series of developed cathepsin-specific antibodies [45].

The inhibitors produced by microscopic living entities constitute a large number of antiproteolytic agents. Indeed, cysteine protease inhibitors belonging to 16 out of 20 families are distributed among microorganisms, including viruses. Moreover, several inhibitor families (I50, I57, I58, I69, I79, I81 and I91) comprise the proteins encoded exclusively in microbial or viral genomes [3]. Some of the inhibitors (e.g., α2-macroglobulin, thyropins and propeptides) are found predominantly in higher eukaryotes, where they interact with endogenous proteases. However, their homologues may also occur in microorganisms, probably as a result of horizontal gene transfer between ecologically related species, such as humans and pathogenic or commensal bacteria. The acquisition of exogenous inhibitor-encoding genes by microbes may confer on them improved adaptation to the host environment [46]. In general, microorganisms synthesize cysteine protease inhibitors for several purposes. Firstly, the intracellular inhibitors may regulate the activity of endogenous proteases in order to prevent excessive proteolysis, which would be harmful to a microbial cell (Fig. 1a) [47]. Secondly, the inhibitors secreted by a microbe may hinder the activity of exogenous proteases produced by the host as defense factors against infections (Fig. 1b) [48]. Finally, microbial inhibitors may restrain both innate and adaptive immune responses of the host [49], as depicted in Fig. 2. Different inhibitors vary in size, structure, specificity, binding mechanism, distribution pattern and functional significance for their producers. The multiplicity and abundance of microbial cysteine protease inhibitors are described in the next chapter.

Fig. 1

Possible roles of cysteine protease microbial inhibitors in host invasion by pathogenic microorganisms. a Intracellular inhibitors of microbial endogenous cysteine proteases protect microorganisms from premature activation of the proteases, which are destined for secretion to digest host proteins. b Secreted microbial inhibitors of exogenous cysteine proteases prevent the elimination of microorganisms by the host’s proteolytic defense system

Fig. 2

Possible roles of cysteine protease microbial inhibitors in the host’s immune system evasion by pathogenic microorganisms. a Inhibitors secreted by the microorganisms being endocytosed by the host’s innate immune cells suppress the lysosomal cathepsins involved in pathogen phagocytosis. b Inhibitors secreted by the microorganisms being endocytosed by the antigen-presenting cells suppress the lysosomal cathepsins responsible for antigen processing and MHC-II maturation (degradation of Ii), thus leading to the limitation of the host’s adaptive immune response (decreased production of microorganism-specific antibodies)

In contrast to proteinaceous inhibitors, which usually occupy the enzyme’s active site cleft [31] or even enclose its whole molecule to restrict substrate accessibility [50], small-molecule inhibitors interact specifically with the enzyme’s amino acid residues responsible for catalysis [29]. Hence, such inhibitors may suppress their target enzymes with very high efficacy [51]. The reactive moiety of a small-molecule inhibitor, named “warhead,” binds either reversibly or irreversibly to the enzyme’s catalytic residue, while other parts of the inhibitor may confer specificity on it via selective interactions with the substrate-binding sites [52, 53]. There exists a variety of cysteine protease small-molecule inhibitors produced by living organisms or synthesized in chemical laboratories. They all block the protease’s reactive site through the electrophilic attack of a “warhead” on the thiol group of the catalytic cysteine. A number of reactive “warheads” have been identified or developed, including epoxides, nitriles, cyclic ketones, vinyl sulfones, disulfides, hydrazones, oxapenams and azepanone analogues [29, 52]. The broad possibilities for designing and producing small-molecule inhibitors of cysteine proteases have attracted the attention of pharmaceutical companies, which have started an intense pursuit of drugs to combat the diseases linked with the hyperactivity of cysteine proteases. Several such drugs have been synthesized so far and tested in clinical trials [16, 54]. For instance, odanacatib (developed by Merck & Co., Inc.) has proven to be a potent, selective and non-toxic inhibitor of human cathepsin K, successfully applied for the investigational treatment of osteoporosis and bone metastasis [55]. Cysteine protease small-molecule inhibitors have been isolated from cellular extracts or culture media of numerous microorganisms. Some of them have been further developed or used as lead compounds to design new structures with the potential for application in different sectors of industry [16, 56]. Several microbial small-molecule inhibitors of cysteine proteases are also summarized in the following chapter.

Cysteine protease microbial inhibitors and their producers

Viruses

Viruses are microscopic infectious agents, which cannot be defined as living organisms since their reproduction fully depends on the biochemical machinery of a host cell. Therefore, viruses have developed multiple mechanisms enabling them to infect a host cell and reprogram its metabolism to provide the optimal environment for their survival and multiplication. Small viral genomes encode the proteins crucial for effective host colonization. These proteins include cysteine protease inhibitors.

Viral inhibitors of cysteine proteases are represented in six inhibitor families [3]. The majority of them target proapoptotic caspases, thereby allowing the infected host cells to survive long enough to synthesize new viral particles. Studies on the baculovirus Autographa californica and its insect host Spodoptera frugiperda have led to the identification of a specific viral gene product, p35, which is responsible for blocking the apoptotic response of virus-infected insect cells [57]. Further analysis proved that p35 inhibits human caspases 1, 3, 6, 7, 8 and 10 with high efficacy [58]. The crystal structure of p35 in complex with caspase-8 revealed the mechanism of inhibition (Fig. 3a); p35 is an irreversible suicide inhibitor, which serves as a substrate analogue for its target enzyme. Upon caspase-induced cleavage of the reactive site peptide bond of p35, the inhibitor undergoes conformational changes to form a stable thioester intermediate with its N-terminus inserted into the enzyme’s active site cleft [41]. The p35-encoding gene is considered a promising tool in molecular research on apoptosis and in gene therapy for diseases driven by excessive apoptosis (more details in chapter 4).

Fig. 3

Structures of cysteine proteases complexed with proteinaceous (a–d) and small-molecule (e, f) microbial inhibitors. The highlighted amino acid residues have their side chains presented as ball-and-stick models. a Caspase-8 (navy blue) complexed with p35 (red); Protein Data Bank (PDB) code: 1I4E. The active site residues of caspase-8 are as follows: His317 (green) and Cys360 (yellow). Upon caspase-induced cleavage of Asp87 (pink) at the apex of p35 reactive site loop, the inhibitor undergoes a major conformational change and inserts its N-terminus: Cys2 and Val3 (orange) into the caspase’s active site to exclude solvent accessibility of His317 residue in the catalytic dyad [41]. b Staphopain B (navy blue) complexed with staphostatin B (red); PDB code: 1Y4H. The active site residues of staphopain B are as follows: Gln237 (cyan), Cys243 (yellow), His340 (green) and Asn360 (orange). The binding loop of staphostatin B spans through both non-prime and prime substrate-binding sites of staphopain B. The loop is not hydrolyzed by the enzyme as it contains Gly98-Thr99 reactive site peptide bond; Gly98 (pink) and Thr99 (purple) are in the P1 and P1′ positions, respectively. Gly98 allows a unique conformation not possible for other amino acid residues, which makes the carbonyl oxygen of the peptide bond point away from the oxyanion hole formed by catalytic Gln237 and Cys243, thus preventing cleavage of Gly98-Thr99 [72]. c Cathepsin V (navy blue) complexed with clitocypin (red); PDB code: 3H6S. The active site residues of cathepsin V are as follows: Gln19 (cyan), Cys25 (yellow), His164 (green) and Asn188 (orange). The structure of clitocypin is reminiscent of a tree with a trunk composed of a six-stranded β-barrel, a crown made of two layers of loops and a root containing two additional loops. The two loops belonging to the lower layer of crown loops form a wedge-shaped edge, which fills the cathepsin’s active site cleft along its whole length, with the first lasso-like loop bound into the non-prime, and the second shorter loop bound into the prime substrate-binding site. Both loops attach clitocypin to cathepsin V through distinct hydrogen bonds: one bond between the side chain amide of the first loop’s Asn18 (pink, on the right) and the side chain of the cathepsin’s Gln63 (purple, on the right), and the other bond between the carbonyl of the second loop’s Ser42 (pink, on the left) and the side chain amide of the cathepsin’s Gln145 (purple, on the left) [110]. d Falcipain-2 (navy blue) complexed with chagasin (red); PDB code: 2OUL. The active site residues of falcipain-2 are as follows: Gln19 (cyan), Cys25 (yellow), His157 (green) and Asn187 (orange). Three CDR-like loops of chagasin form a wedge, which is well aligned to the falcipain’s active site cleft. The protease binding loops are as follows: the BC loop (pink) homologous to the motif in human CD8α and responsible for binding to the falcipain’s catalytic cysteine, the highly mobile DE loop (purple) and the FG loop (gray) [143]. e Papain (navy blue) complexed with leupeptin (presented as a stick model); PDB code: 1POP. The active site residues of papain are as follows: Gln19 (cyan), Cys25 (yellow), His159 (green) and Asn175 (orange). Leupeptin is composed of N-acetyl (purple), two leucyl (pink) and argininal (red) residues. The electrophilic carbon atom of the leupeptin’s arginine aldehyde is covalently bound by the sulfur atom of the papain’s catalytic Cys25 (the two interacting atoms are indicated by a black arrow). The oxygen atom of the leupeptin’s aldehyde faces the oxyanion hole and makes hydrogen bonds with Gln19 and Cys25 [92]. f Cathepsin K (navy blue) complexed with E-64 (red, presented as a stick model); PDB code: 1ATK. The active site residues of cathepsin K are as follows: Gln19 (cyan), Cys25 (yellow), His162 (green) and Asn182 (orange). After the opening of the epoxide ring of E-64, the inhibitor’s electrophilic carbon atom C2 is covalently bound by the sulfur atom of the cathepsin’s catalytic Cys25 (indicated by a black arrow) [127]. All structure figures were generated from Protein Data Bank Japan (PDBj). The catalytic amino acid residues were specified in accordance with the MEROPS database [3]. Color in the online version

It has been shown that a functional mutation in the p35 gene of A. californica can be complemented by the action of proteins encoded in the genomes of other baculoviruses. These proteins, named inhibitors of apoptosis (IAPs), constitute a family of molecules that exhibit no significant homology to p35 and contain at least one zinc finger-like motif. The IAP genes were first identified in the genomes of Cydia pomonella granulosis virus [59] and Orgyia pseudotsuga nuclear polyhedrosis virus [60]. However, the occurrence of their homologues is not restricted to viruses as they can also be found in fungi, protists and animals, including humans [3]. IAPs prevent apoptosis by inhibiting both initiator and executor caspases. The mechanism of inhibition depends on the type of targeted caspase; thus, IAPs may interact either with the executor caspase’s active site or with the initiator caspase’s binding site responsible for enzyme homodimerization and activation [27].

The cysteine protease inhibitors affecting apoptotic pathways are also encoded in the genomes of several poxviruses. The vaccinia virus-encoded F1L protein, which contains the Bcl-2 homology domains, suppresses proapoptotic proteins of the Bcl-2 family (Bak and Bax) and inhibits the initiator caspase-9, thereby neutralizing two sequential steps in the intrinsic (mitochondrial) apoptotic pathway [61]. F1L localizes to the mitochondria through a C-terminal hydrophobic domain and specifically inhibits caspase-9 by binding to its active site in an orientation opposite to that of the substrate [62, 63]. The serpin CrmA (cytokine response modifier A), encoded by the cowpox virus, inhibits the caspases that either initiate the extrinsic (death receptor-mediated) apoptotic pathway (caspases 8 and 10) or activate proinflammatory cytokines (caspase-1), thus allowing the virus to avoid apoptotic and inflammatory responses of infected host cells. CrmA shares a general structural homology with serpins, even though it lacks substantial parts of the domains present in other serpins. Such a cleaved form of the viral serpin represents the economic strategy of the virus to maintain only the sequences essential for the inhibitor’s activity and integrity [64]. Similar to other serpins, CrmA is an irreversible suicide inhibitor, which forms a stable acyl enzyme intermediate through active site distortion [65]. The results of several experiments have shown that the CrmA-encoding gene may prove advantageous for gene therapy (see chapter 4).

Viruses may also encode inhibitors targeting cysteine proteases from the papain family. Such inhibitors comprise the homologues of propeptides and cystatins [3]. The latter are represented in the genome of a bracovirus involved in host–parasite interaction. The parasitoid wasp Cotesia congregata injects its eggs together with the bracovirus particles into the hemolymph of the host caterpillar Manduca sexta. Upon infection, the viral genes are highly expressed, including the members of a multigene family that encode three cystatins homologous to the type 2 cystatins of the subfamily I25B. As a result, the caterpillar’s immune response is diminished, allowing the parasitoid wasp eggs to survive and develop into adult organisms inside their host. This is one of a few known examples of mutualism between viruses and eukaryotes [66]. Evolution of the viral cystatin multigene family is driven by a strong positive selection, implying that the bracovirus cystatins coevolve with their target host cysteine proteases [67].

Bacteria

Bacterial proteinaceous inhibitors of cysteine proteases belong to ten inhibitor families. Several families comprise inhibitors produced only by bacteria, namely staphostatins and the streptopain inhibitor [3]. Staphostatins have been identified in Staphylococcus aureus and coagulase-negative staphylococci [68, 69]. They are synthesized as intracellular proteins, which form β-barrels and structurally resemble lipocalins but not cystatins, despite their cystatin-like size of 105–108 amino acid residues [69, 70]. Staphostatins are endogenous inhibitors of the most intensively secreted cysteine proteases of S. aureus, named staphopains [70]. Staphopains are folded in a papain-like manner and are suggested to play a role in S. aureus pathophysiology, although there is still a lack of convincing in vivo data on their function as virulence factors. On the other hand, the results of in vitro experiments demonstrate a number of pathological roles for these proteases, including destruction of connective tissue, disturbance of plasma clotting, induction of septic shock, interactions with host immune cells and modulation of S. aureus biofilm architecture [47, 71]. Staphostatins A and B display extremely high selectivity toward their target enzymes, staphopains A and B, respectively, with no cross-inhibition of the other cysteine proteases being observed [69]. Staphostatins are reversible, competitive and tight-binding inhibitors, which form stable and non-covalent complexes with staphopains [69, 72]. The polypeptide chain of staphostatin spans the staphopain’s active site cleft in the forward direction. The conserved glycine residue in the staphostatin’s reactive site prevents cleavage of the inhibitor by staphopain (Fig. 3b); substitutions of the glycine residue result in a loss of the inhibitor affinity for the protease and convert the inhibitor into a substrate [70]. Similar to staphostatins, the streptopain inhibitor from Streptococcus pyogenes targets exclusively one endogenous cysteine protease destined for secretion. The streptopain inhibitor is homologous to the streptopain propeptide, thus both of them are presumed to inactivate the enzyme in a corresponding manner [73]. Staphostatins, as well as the streptopain inhibitor, are encoded in the same operon as their target enzymes [69, 73]. Therefore, the inhibitors may effectively regulate the intracellular proteolytic activity and protect cytoplasmic molecules from hydrolysis by prematurely activated or misdirected bacterial cysteine proteases [74].

Bacteria also produce cysteine protease inhibitors that exhibit much broader specificity than those described above. α2-Macroglobulins interact with a wide variety of endopeptidases, regardless of catalytic type, by enclosing the whole peptidase molecules to restrict substrate accessibility in a molecular size-dependent manner, with very large substrates being completely excluded from the access to the enzyme’s active site [75]. Due to the homology with some major components of the complement system, such as proteins C3, C4 and C5, and a role in the clearance of microbial endopeptidases from the plasma, metazoan α2-macroglobulins are considered an important part of the host’s innate immunity [76–78]. The acquisition of metazoan α2-macroglobulin-encoding genes by bacteria via horizontal gene transfer may have facilitated the adaptation of these microorganisms to the host environment. Indeed, bacterial α2-macroglobulin homologues can block the host proteases involved in antimicrobial defenses, thus functioning in reverse to their metazoan counterparts [46, 78]. Bacterial α2-macroglobulin genes are found in both pathogenic and saprophytic species of diverse clades, including bacteroidetes, cyanobacteria, deinococcids, fusobacteria, planctomycetes, proteobacteria, spirochetes and thermotogae [46]. The genes encoding more specific inhibitors of cysteine proteases have also been acquired by bacteria from eukaryotes. Such genes may code for the homologues of higher eukaryotic inhibitors (cystatins, thyropins, propeptides and serpins) or parasitic inhibitors (chagasin and falstatin, described later in this chapter) [3]. The distribution of bacterial cystatin homologues is very patchy, with members of the cystatin subfamily being limited to bacteroidetes and proteobacteria (mainly to the genus Vibrio), and members of the stefin subfamily found in actinobacteria, bacteroidetes, chlorobi, cyanobacteria, firmicutes, fusobacteria, proteobacteria and spirochetes [28]. The homologues of chagasin, a protein discovered in Trypanosoma cruzi, are encoded in every sequenced representative of the genus Pseudomonas, as well as in several other bacteria [79]. A recombinant form of the chagasin homologue from Pseudomonas aeruginosa proved to be a potent reversible inhibitor of mammalian cathepsin L and protozoan cysteine peptidase [80].

Several proteinaceous inhibitors of cysteine proteases, which can be found in bacteria, have not yet been assigned to any inhibitor family. They include the transglutaminase substrate from Streptomyces mobaraensis [81]. Transglutaminases are widespread in nature and occur in animals, plants, bacteria, archaea and fungi [82]. These enzymes catalyze calcium-dependent acyl-transfer reactions between the γ-carboxamide group of glutamine and either the ε-amino group of lysine or the primary amino group of polyamine residues to form the isopeptide bonds, which enable the intra- or intermolecular cross-linking in peptides and proteins. The resulting supramolecular protein aggregates are resistant to chemical, enzymatic and mechanical disruption [83]. Transglutaminases play multiple physiological roles, being involved in blood coagulation, skin barrier formation, apoptosis, extracellular matrix organization, vascular remodeling and many other processes [84, 85]. On the other hand, the enzymes are implicated in the pathogenesis of celiac disease, fibrosis and neurodegenerative disorders, thus making them important therapeutic targets [86, 87]. Bacterial transglutaminases may constitute virulence factors, which impede the phagocytosis of pathogenic bacteria in host blood [88]. Bacteria produce a variety of substrates for their transglutaminases. These substrates may exhibit an inhibitory effect on proteolytic enzymes, as is the case for several proteins from S. mobaraensis. The bacterium secretes a heat-resistant transglutaminase substrate, with a molecular mass of 12 kDa, which inhibits the cysteine proteases papain and bromelain, and the serine protease trypsin. The substrate has been named Streptomyces papain inhibitor (SPI) as papain was the most susceptible to inhibition among all tested proteases. SPI and other S. mobaraensis transglutaminase substrates with antiproteolytic potential are therefore suggested to play a role in the defense of protein aggregates, stabilized by bacterial transglutaminases, against host proteases [81]. More recently, it has been discovered that SPI also affects the activity of bacterial cysteine proteases, such as staphopain B, and inhibits the growth of Bacillus anthracis, Pseudomonas aeruginosa, Staphylococcus aureus and Vibrio cholerae, thus revealing the possibility of its application for the treatment of diseases caused by unrelated pathogenic bacteria [89].

The molecular nature of cysteine protease bacterial inhibitors is not limited to proteinaceous compounds. Bacteria also produce a multitude of small-molecule inhibitors; one of them is leupeptin, secreted extracellularly by various species of actinomycetes. Leupeptin is a tripeptidyl aldehyde, which occurs in the form of acetyl- or propionyl-L-leucyl-L-leucylargininal [90]. It binds covalently to the catalytic cysteine and serine residues of cysteine and serine proteases, respectively, acting as their reversible and competitive inhibitor [90, 91]. The structural analysis of leupeptin in complex with papain showed that the carbon atom of the inhibitor’s aldehyde group is covalently bound by the sulfur atom of the nucleophilic thiol group in the protease’s catalytic cysteine (Fig. 3e) [92]. Leupeptin seems to play an important role in the control of morphological differentiation of Streptomyces exfoliatus. The bacterium produces leupeptin during exponential growth, allowing the inhibitor to block the activity of endogenous trypsin-like protease required for the formation of aerial mycelia. At later growth stages, leupeptin is hydrolytically inactivated by a metalloproteinase-like enzyme, leading to the reactivation of trypsin-like protease and, consequently, the development of aerial mycelia [93, 94]. A number of leupeptin analogues have been synthesized in order to improve inhibitor selectivity toward specific proteases. Different amino acid substitutions in the sequence of the tripeptidyl aldehyde resulted in the discovery of potent and selective inhibitors of papain [95], cathepsin B [56] and cathepsin L [96], among others. Leupeptin and its synthetic derivatives have been used in studies on diverse pathological conditions. Moreover, they are promising candidates for application in the targeted therapies to treat cysteine cathepsin-related diseases. 
Another peptidyl aldehyde, antipain, is also secreted by actinomycetes and appears to be more specific for papain and trypsin than leupeptin [97]. The cyanobacterium Anabaena circinalis produces circinamide, a peptidyl epoxysuccinyl-based compound, with stronger inhibitory activity against papain than that of leupeptin [98]. Some non-peptidyl small-molecule inhibitors of cysteine proteases have been isolated from a marine Pseudomonas strain and identified as aryl diesters, namely dibutyl phthalate and di-(2-ethylhexyl) phthalate. They both proved to be tight-binding, reversible and non-competitive inhibitors of cathepsin B. It is noteworthy that the same phthalates, manufactured synthetically, are widely used as plasticizers in many industrial products [99].

There is a growing number of scientific reports on the inhibitory potential of diverse bacterial strains toward cysteine proteases. The presence of cysteine protease inhibitors has been revealed in conditioned media, crude cell extracts and periplasmic extracts of different clinical and environmental strains. For instance, inhibitory activity against mammalian cathepsins was observed in Plesiomonas shigelloides [100] and in actinomycetes and sphingomonads associated with Caribbean sponges [101]. However, these inhibitors have not been purified and identified so far.

Archaea

The domain Archaea comprises the prokaryotic microorganisms, which are commonly viewed as extremophiles living in harsh environments, such as hot acid springs, salt brines and the ocean depths. However, archaea have also been found in a broad range of less hostile biotopes, including terrestrial soils, lakes, marshlands, marine plankton and freshwater sediments. Additionally, they may be associated with metazoan organisms, predominantly as their mutualists or commensals [102]. Some methanogenic archaeal species inhabit the human microbiome at different sites, e.g., oral cavity, intestine or vagina [103]. Therefore, it seems reasonable to assume that archaea may produce cysteine proteases and their inhibitors for better adaptation to the host environment. Unfortunately, data on archaeal inhibitors of cysteine proteases are scarce. No such inhibitors have been isolated and characterized so far. Nevertheless, the genes coding for the homologues of serpins, α2-macroglobulins and chagasin have been identified in the genomes of several archaeal species. Methanococcoides burtonii, a methylotrophic methanogenic archaeon discovered in an Antarctic lake, has been so far the only archaeal species known to encode the representatives of all three aforementioned inhibitor families [3]. Recombinant forms of the serpin homologues from the hyperthermophilic archaea Pyrobaculum aerophilum (aeropin) and Thermococcus kodakaraensis (Tk-serpin) have been characterized, though not tested on cysteine proteases. Both serpin-like proteins are resistant to thermal denaturation [104, 105]. The unique property of Tk-serpin is that its inhibitory activity against several serine proteases increases with temperature up to 100 °C [105].

Fungi

Fungi, including both filamentous and unicellular forms, are a source of a much wider spectrum of proteases and other enzymes than bacteria. They produce acidic, neutral and alkaline proteases with broad substrate specificity, predominantly secreted for different purposes, such as nutrient acquisition, adaptation to unfavorable environments and symbiotic or antagonistic interactions with other organisms [106, 107]. The pool of secreted proteases consists mostly of aspartic proteases, serine proteases and metalloproteinases. Therefore, extracellular cysteine proteases do not seem to play any important roles in fungal pathophysiology [107, 108]. On the other hand, a substantial number of fungal species encode cysteine proteases, some of which have already been characterized. Fungal inhibitors of cysteine proteases are also encoded in many genomes and represented in seven inhibitor families [3].

Clitocypin is one such inhibitor. It was originally isolated from the fruit bodies of the edible fungus Lepista nebularis (formerly Clitocybe nebularis) [109], but is also encoded in the genome of the filamentous phytopathogenic fungus Rhizoctonia solani [3]. This monomeric protein, with a molecular mass of 16.8 kDa, forms a non-covalent homodimer, which inhibits peptidases of the family C1: papain, bromelain, cathepsins B and L, but not cathepsin H. The serine protease trypsin and the aspartic protease pepsin are unaffected. Clitocypin does not possess cysteine or methionine residues, thus not being able to form any disulfide bonds, in contrast to many other inhibitors [109]. Its tertiary structure has been solved, revealing a β-trefoil fold similar to that of the Kunitz-type serine protease inhibitors. However, unlike the latter, clitocypin binds to the target protease along the whole active site cleft in a cystatin-like manner. It forms a wedge made of loops, which restrict substrate accessibility by occluding the active site cysteine (Fig. 3c) [110]. Macrocypins, the inhibitors belonging to another family and isolated from the higher fungus Macrolepiota procera [111], resemble clitocypin in terms of their tertiary structure and enzyme binding geometry [110], but the homologues have not been found in any filamentous or unicellular fungi [3]. Both clitocypin and macrocypin homologues, collectively named mycocypins, may function as intracellular controllers of the activity of endogenous proteases, or as virulence factors of some pathogenic fungi [110, 111].

Several phytopathogenic fungi secrete cysteine-rich effector proteins, including diverse enzymes, protease inhibitors, toxins or other factors (e.g., chitin-binding proteins protecting fungal cell walls against plant chitinases), which facilitate host colonization [112]. The biotrophic fungus Passalora fulva (formerly Cladosporium fulvum) secretes the avirulence effector protein 2 (Avr2) into the apoplast of tomato leaves, where it inhibits host protective cysteine peptidases, thus promoting plant apoplast infection [113, 114]. The mature form of Avr2 consists of 58 amino acid residues with 8 cysteine residues forming four disulfide bridges, three of which provide a stable structure [115]. Avr2 inhibits several tomato papain-like cysteine proteases, binding to Rcr3 (uncompetitively) and Pip1 with the highest affinity, and to TDI65 and aleurain with lower affinity [114–116]. Its virulence is demonstrated by enhanced susceptibility of Avr2-expressing tomato and Arabidopsis thaliana toward extracellular fungal pathogens [114]. Overcoming the plant’s proteolytic defense system by fungal effector proteins triggers another defense system in plants, which is based on effector perception by the plant’s cognate resistance proteins. Indeed, one of these proteins, the extracellular leucine-rich repeat receptor-like protein (Cf-2), is secreted into the apoplast and initiates the hypersensitive response only in the presence of the Avr2 effector complexed with the Rcr3 peptidase [113, 115]. Furthermore, pathogenic fungi might possibly defend themselves against the Cf-2-mediated plant hypersensitive response by producing truncated variants of Avr2 [117].

The distribution of cystatin homologues among fungi is strikingly limited. In fact, only one fungal cystatin-like protein has been characterized to date after its purification from the conditioned medium of the pathogenic fungus Candida albicans [118]. This dimorphic microorganism is the most common cause of fatal fungal infections in humans [119]. It produces diverse proteases, which, together with the cystatin-like inhibitor, may be involved in the host invasion. The fungal cystatin homologue has been identified as a heat- and pH-stable protein with molecular mass of 15 kDa and the N-terminal sequence similar to that of human cystatin A. It proved to reversibly and non-competitively bind to papain. The inhibitor’s secretion level was much higher for the yeast form compared to the hyphal form of C. albicans [118]. Moreover, the genes encoding proteins homologous to serpins, falstatins and IAPs have been found in the genomes of different fungal species [3]. For instance, two yeast species, Saccharomyces cerevisiae and Schizosaccharomyces pombe, encode a single IAP designated BIR1 and bir1, respectively. Yeast IAP molecules are required for cell division and cytokinesis, though their effect on cysteine proteases has not been investigated [120]. The extra- and intracellular inhibitory potential of different yeast strains of the Saccharomycetaceae family on papain has also been shown [121].

Cysteine protease small-molecule inhibitors are secreted by several fungal species. A peptidyl epoxide, namely 1-[N-[(L-3-trans-carboxyoxirane-2-carbonyl)-L-leucyl]amino]-4-guanidinobutane, commonly known as E-64, has been purified from the conditioned medium of the soil fungus Aspergillus japonicus [122]. E-64 inactivates cysteine proteases and does not affect proteolytic enzymes of other catalytic types. It covalently and irreversibly binds to the peptidases of the clan CA and inhibits them in a time- and dose-dependent manner [123]. However, it does not interact with members of the clan CD, such as caspases and legumains [124]. The X-ray crystal structures of E-64 complexed with target cysteine proteases have shown that the electrophilic carbon atom of the inhibitor’s epoxide ring covalently binds to the sulfur atom of the thiol group in the protease’s catalytic cysteine, forming a stable thioether (Fig. 3f) [125–127]. Chemical modifications of E-64 have led to the development of different epoxysuccinyl-based compounds, which exhibit selectivity toward specific cysteine cathepsins [53, 128] and possess the ability to penetrate cell membranes in contrast to the original fungal inhibitor [129]. E-64 and its selected derivatives are listed in Table 1, with their applications being briefly described in the next chapter.

Table 1 Development of epoxysuccinyl-based inhibitors of cysteine proteases

Besides, not only E-64, but also other peptidyl epoxides have been isolated from fungal species, such as the selective inhibitors of cathepsins B and L produced by Colletotrichum sp. [136] and Aphanoascus fulvescens [137]. Epoxysuccinyl derivatives have been identified in the culture filtrate of the soil fungus Myceliophthora thermophila and named estatins. They specifically inhibit proteases of the papain family and suppress the production of an allergen-specific immunoglobulin E in vivo [138]. Cathestatins, decarbamidoyl analogues of estatins, have been discovered as secondary metabolites of Penicillium citrinum and Microascus longirostris associated with marine sponges. The inhibitors exhibit high potency against cathepsin L and, to a lesser extent, cathepsin B and other members of the papain family [139, 140]. Furthermore, cathestatin B was shown to restrain bone collagen degradation in vitro [139].

Protists

A large number of parasitic protists produce cysteine proteases, belonging mostly to the papain family, which are pivotal for the regulation of the parasite’s life cycle. These enzymes also constitute virulence factors, which enable host invasion via induction of different pathological processes described in the introduction. The synthesis of cysteine proteases in protists is accompanied by the production of endogenous cysteine protease inhibitors. The latter are often involved in the control of parasite development through interactions with their cognate enzymes. Many of them are homologous to the inhibitors occurring in the hosts of parasitic protists and modulate the activity of the host’s proteolytic defense system, thereby pointing to their roles in host infection. Protozoan inhibitors of cysteine proteases are grouped into eight inhibitor families [3].

Chagasin is the inhibitor discovered in the intracellular parasitic protist Trypanosoma cruzi [141], which causes Chagas’ disease, a chronic illness affecting the heart muscle and digestive system. T. cruzi is transmitted to mammals, including humans living in Latin America, by hematophagous triatomines (“kissing bugs”) and has several morphological forms: epimastigotes replicating in the intestinal tract of the insect vector, metacyclic trypomastigotes found in bug feces and passed on to mammals, amastigotes multiplying in the mammalian cell cytoplasm, and trypomastigotes invading the mammalian bloodstream [26]. Chagasin is the physiological regulator of cruzipain, the cysteine cathepsin-like and major protease of T. cruzi [141]. Cruzipain is an essential virulence factor of the protist, not only involved in nutrient processing, but also triggering the release of proinflammatory kinins and inactivation of the complement system in a host. It is produced at all stages of the parasite’s life cycle and localizes to the lysosome, the flagellar pocket and the plasma membrane of epimastigotes and amastigotes [26], while the synthesis of chagasin is regulated developmentally and its localization is limited to the flagellar pocket and cytoplasmic vesicles of trypomastigotes, and to the plasma membrane of amastigotes. Chagasin is a monomeric protein with a molecular mass of 12 kDa [141]. Its tertiary structure shows eight β-strands arranged in two β-sheets, adopting an immunoglobulin-like fold [142]. Three loops connecting the β-strands constitute the counterpart of the complementarity determining regions (CDRs). The chagasin CDR loops are conserved in all chagasin homologues and form a tripartite wedge, which occludes the target protease’s active site cleft in a similar manner to that of cystatins and p41Ii (Fig. 3d) [143]. 
Chagasin does not only interact with its cognate enzyme cruzipain, but also effectively and reversibly inhibits other protozoan cysteine peptidases, as well as papain and human cathepsins B, H, K and L [141, 143]. The under- and overexpression of chagasin in T. cruzi mutants revealed the inhibitor’s importance during parasitic infections of mammalian cells [144]. The homologues of chagasin have been identified in Trypanosoma brucei (the causative agent of African trypanosomiasis, known as sleeping sickness in humans) and Leishmania spp. (responsible for the disease leishmaniasis) [79, 80]. Other inhibitors assigned to the chagasin family comprise amoebiasin from Entamoeba spp. [145] and cryptostatin from Cryptosporidium spp. [146].

Another intracellular parasitic protist Plasmodium falciparum produces the cysteine protease inhibitor named falstatin [147]. P. falciparum is exclusively transmitted by Anopheles spp. mosquitoes and causes malaria in humans living in tropical and subtropical regions of Africa [148]. The pathogen exists in several morphological forms, such as sporozoites and merozoites in the human bloodstream and liver, or rings, trophozoites, schizonts and gametocytes inside the human erythrocytes [149]. Falstatin is the endogenous inhibitor of falcipains, which belong to the papain family and constitute the major P. falciparum cysteine proteases involved in erythrocyte invasion and hemoglobin hydrolysis [25]. Falstatin is synthesized in merozoites, rings and schizonts, but not in trophozoites, in which the activity of falcipains is maximal [25, 147]. Falstatin is located at the periphery of rings and early schizonts, diffused in late schizonts and merozoites, and released upon the rupture of mature schizonts [147], while falcipains localize to the food vacuole, where the degradation of hemoglobin occurs [150]. Falstatin exhibits only partial sequence similarity to that of chagasin and is much larger, with molecular mass of approximately 47 kDa, though its tertiary structure has not yet been solved. It does not only inhibit falcipains with high efficacy, but also reversibly and competitively inactivates other cysteine peptidases from different Plasmodium species, as well as papain and human cathepsins H, K and L, calpain-1, caspase 3 and caspase 8. Studies on the biological activity of falstatin in P. falciparum indicate that the inhibitor may facilitate erythrocyte invasion through interactions with protozoan and host cysteine proteases [147].

The analysis of different protozoan genomes has led to the identification of genes encoding cysteine protease inhibitors, which belong to well-characterized and widespread families, such as serpins, cystatins, propeptides, IAPs and α2-macroglobulins. For instance, a murine cathepsin H propeptide-like protein is encoded in the genome of Paramecium tetraurelia, and a human cystatin B-like molecule—in the genome of Symbiodinium minutum [3]. A cystatin-like inhibitor has been identified in the parasitic protist Trichomonas vaginalis, the causative agent of trichomoniasis. The inhibitor has proven to interact with the endogenous cysteine protease TvCP39 involved in cytotoxicity, therefore, it was suggested to play a role in the regulation of the cellular damage caused by T. vaginalis [151]. The discovery of toxostatins, the endogenous inhibitors of cysteine proteases in the intracellular parasite Toxoplasma gondii [152], has resulted in the foundation of a new inhibitor family [3]. Toxostatins efficiently inhibit cathepsin-like proteases of T. gondii, but have no effect on pathogen invasion and its intracellular replication [152].

Studies on microbial inhibitors of cysteine proteases are very dynamic; new inhibitors are still being discovered and characterized. The multiplicity of microorganisms is reflected in the very large number of microbial inhibitors known to date. Selected cysteine protease inhibitors produced by different groups of microorganisms are summarized in Table 2 to illustrate their diversity in terms of structure, binding mechanism and specificity.

Table 2 Selected microorganism-derived inhibitors of cysteine proteases (the structural formulas of small-molecule inhibitors are given)

Factual and possible applications of cysteine protease microbial inhibitors

The baculovirus protein p35, which inhibits proapoptotic caspases, has been used in many studies to elucidate the pathways of apoptosis in different types of cells under various conditions. In anticancer research, p35 made it possible to determine whether thapsigargin, an inhibitor of the endoplasmic reticulum-associated calcium-ATPase, induces apoptosis in breast cancer cells via activation of the caspase proteolytic cascade [153]. Applications of p35 in experimental therapies are also extensive. For instance, in the cytochrome P450 gene-directed enzyme prodrug therapy (P450 GDEPT), the P450-expressing tumor cells transduced with the p35-encoding gene exhibited transiently decreased sensitivity to the cytotoxic effect of the anticancer prodrug cyclophosphamide (CPA). Therefore, the production and release of P450-activated CPA metabolites by the P450-expressing tumor cells were prolonged, resulting in increased cytotoxicity toward the P450-deficient tumor cells present in the tumor milieu. This approach indicated that p35 may serve as an enhancer for P450 GDEPT and other similar anticancer strategies [154]. Chronic and acute human neurological diseases, such as amyotrophic lateral sclerosis (ALS), encephalopathy, cerebral ischemia or Huntington’s disease, are caused by apoptosis-mediated neuronal cell death. The expression of p35 in neurons undergoing neurotoxic changes may prevent their apoptosis, as was the case for human cerebral neurons, thus suggesting the therapeutic relevance of p35 for the treatment of human neurodegenerative diseases [155]. Besides, p35 has been found useful for therapies against diabetes, inflammatory arthritis, cardiovascular and ocular disorders, as well as for biotechnological purposes due to its roles in delaying plant senescence or endowing plants with resistance to phytopathogens and abiotic stresses. All these applications of p35 are reviewed by Sahdev et al. [156].

Another inhibitor of caspases, the viral serpin CrmA, has been tested for its therapeutic potential and proved to rescue hepatic and renal cells from apoptosis in experimental gene therapies, thus preventing acute hepatitis [157] and nephrotoxicity [158], respectively. The CrmA gene transduction into grafted cells may eventually protect against xenograft rejection [159]. Furthermore, it was shown that the CrmA gene co-expressed in vivo with a transgene of interest may significantly prolong the expression of the latter by delaying the cytotoxic T lymphocyte-induced apoptosis of adenoviral vector-transduced host cells [160].

Leupeptin, a bacterial inhibitor of cysteine and serine peptidases, has multiple applications in basic and applied science. It is an ingredient of commercially available protease inhibitor cocktails, routinely used in many laboratories. Leupeptin and its analogues have been employed in the investigation of cysteine protease functions in different processes, such as macroautophagy [161], MHC-II maturation and trafficking [162] or cellular response to X-ray irradiation [163]. Leupeptin has also been used in the research on development of antimalarial agents and contributed to the identification of new drug targets in the P. falciparum strains resistant to inhibition [164]. Moreover, the treatment with leupeptin proved advantageous in several therapeutic approaches, exerting such effects as: improved motoneuron survival and muscle function after nerve injury [165], protection of inner ear hair cells from aminoglycoside ototoxicity [166], prevention of gentamicin-induced lysosomal phospholipidosis [167], suppression of gingivitis induced by P. gingivalis [168], heart protection from myocardial stunning [169] and inhibition of ventilation-induced diaphragmatic contractile dysfunction and atrophy [170].

The fungal epoxysuccinyl compound E-64, which selectively and irreversibly inhibits cysteine proteases, is commonly used as an ingredient of protease inhibitor cocktails. It has been applied in a variety of basic research, allowing the identification of cysteine proteases and characterization of their biological functions. E-64 binds to the target enzyme in an equimolar ratio, therefore, the molarity of an active cysteine protease can be determined by stoichiometric titration with this inhibitor [171]. Cell culture and animal model-based studies have shown that E-64 and its derivatives exhibit the potential to become effective drugs for the treatment of a number of pathological conditions associated with proteolysis dysregulation. In fact, the treatment with E-64 improved synaptic transmission and prevented memory loss in Alzheimer’s disease [172], and decreased in vitro invasion and in vivo metastasis of human melanoma cell lines [173]. The ethyl ester of E-64 (E-64d) was tested for the treatment of muscular dystrophy in humans, but its development was stopped in phase III clinical trials due to low efficacy and hepatotoxicity in rats [174]. CA-074, a derivative specific for cathepsin B, reduced invasion of metastatic human melanoma and inflammatory breast cancer cell lines [175, 176]. The intraperitoneal administration of this compound also suppressed bone metastasis in mice with mammary cancer [177]. CA-074 Me, a membrane-permeable analogue of CA-074, diminished in vivo levels of brain β-amyloid related to Alzheimer’s disease [178] and inhibited in vitro invasion of human esophageal squamous cell carcinoma cells [179]. JPM-OEt, a derivative selective for the papain family proteases, enhanced the effectiveness of chemotherapy in a mouse model of pancreatic islet cell carcinogenesis, contributing to tumor regression and increased overall survival [180]. 
Medical applications were also confirmed for the CLIK inhibitors, designed as the epoxysuccinyl compounds targeting individual cysteine cathepsins. For instance, CLIK-148, which selectively inhibits cathepsin L, was shown in the in vivo study to inhibit bone metastases and protect against malignant hypercalcemia in different models of cancer [181]. Another specific inhibitor of cathepsin L, CLIK-195, contributed to reduced body weight gain, decreased serum insulin levels and increased glucose tolerance in mice [182].
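The stoichiometric titration with E-64 mentioned above can be illustrated with a short calculation. The following is only a minimal sketch, assuming that residual enzyme activity falls linearly with added E-64 (1:1 binding) and that the active enzyme concentration is read from the x-intercept of a least-squares line; the function name and the data points are hypothetical and are not taken from reference [171].

```python
def active_enzyme_conc(inhibitor_uM, residual_activity):
    """Estimate active enzyme concentration (same units as the inhibitor)
    as the x-intercept of a least-squares line fitted to the linear part
    of a titration curve. Assumes equimolar (1:1) enzyme-inhibitor binding."""
    n = len(inhibitor_uM)
    mean_x = sum(inhibitor_uM) / n
    mean_y = sum(residual_activity) / n
    sxx = sum((x - mean_x) ** 2 for x in inhibitor_uM)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(inhibitor_uM, residual_activity))
    slope = sxy / sxx                      # activity lost per unit of inhibitor
    intercept = mean_y - slope * mean_x    # activity without inhibitor
    return -intercept / slope              # inhibitor level at zero activity


# Hypothetical titration: E-64 added (µM) vs. residual activity (% of control)
e64_uM = [0.0, 0.2, 0.4, 0.6, 0.8]
activity_pct = [100.0, 80.0, 60.0, 40.0, 20.0]
print(active_enzyme_conc(e64_uM, activity_pct))
```

Only points from the linear region of the curve should be passed to such a fit; comparing the result with the total protein concentration would give the fraction of enzyme molecules that are catalytically active.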

Parasitic protists often depend on their cysteine protease activity during invasion of the host cells. Therefore, attempts to diminish the proteolytic potential of these pathogens may constitute an important step in the still ongoing fight against protozoan infections. T. cruzi produces cruzipain, the major cysteine protease involved in its pathogenicity, the activity of which is controlled intracellularly by the endogenous inhibitor chagasin. The overexpression of chagasin in T. cruzi was shown to slow down protist metacyclogenesis and to decrease the infectivity of the parasite in vitro [144]. Trichocystatin-2, produced by T. vaginalis, is the endogenous inhibitor of the protist’s virulence factor, cysteine protease TvCP39. The treatment of T. vaginalis with a recombinant trichocystatin-2 resulted in reduced in vitro cytotoxicity of the parasite [151]. On the other hand, the cysteine protease inhibitors produced by some protists may be indispensable for maintaining their invasive properties. This is the case for P. falciparum, which produces falstatin, the endogenous inhibitor of falcipains. Both falstatin and falcipains facilitate parasitic infection; therefore, blocking falstatin with specific antibodies may decrease erythrocyte invasion by the protist, as shown in an in vitro experiment. This points to the possibility of targeting falstatin in the course of vaccine development to prevent malaria [147].

The inhibitors with a β-trefoil fold, in particular fungal mycocypins, are considered promising candidates in transgenic trials aimed at crop protection. Their unique structures are composed of different loops, each of which may inhibit proteases from several classes. Such broadly reactive inhibitors are suggested to be more effective against phytopathogenic infections than inhibitors specific for a single protease class. The latter have failed to protect plants from parasitic microbes and insects, because plant pathogens produce diverse proinvasive proteolytic enzymes, which cannot be simultaneously inactivated by such inhibitors [110].

Conclusions

Cysteine protease inhibitors are used by microorganisms for several purposes. They may control the activity of endogenous enzymes or be implicated in microbial pathogenicity through inhibition of the host proteases involved in the immunological defense against infections. Isolation and characterization of a number of such inhibitors from a variety of microorganisms have made it possible to better understand microbial pathophysiology and have contributed to the development of different strategies against pathogen-related diseases. However, more research needs to be done in order to precisely elucidate the inhibitor-mediated interaction between microorganisms and their hosts.

In many organisms, including humans, the activity of endogenous cysteine proteases may be dysregulated, leading to the development of severe pathologies. This raises the need for targeting these enzymes with effective and selective inhibitors. A few microbial inhibitors have been successfully applied for the investigational treatment of cysteine cathepsin-driven diseases. Besides, bacteria and yeast are used routinely in science and biotechnology for efficient heterologous production of biologically active and therapeutically relevant proteins, which include recombinant cysteine protease inhibitors, such as human cystatins and cathepsin propeptides [183–186]. Redesigning the inhibitor-encoding gene in accordance with the codon preference of bacterial or yeast genes may substantially increase the level of transgene expression, as observed for the production of recombinant human cystatin C by Escherichia coli [186] and Pichia pastoris [185]. Viruses, due to their ability to infect host cells with very high efficiency, have also been implemented in the experimental gene therapy targeting cysteine cathepsins, thus allowing for the local overexpression of human cystatin C in the treated host [187]. In conclusion, microorganisms contribute to the dynamic research on cysteine protease inhibitors in many different ways, as producers of both natural and recombinant inhibitors, with their new applications being expected to arise in the future.