BIOTECHNOLOGICALLY RELEVANT ENZYMES AND PROTEINS
Identification of Thermotoga maritima MSB8 GH57 α-amylase AmyC as a glycogen-branching enzyme with high hydrolytic activity

AmyC, a glycoside hydrolase family 57 (GH57) enzyme of Thermotoga maritima MSB8, has previously been identified as an intracellular α-amylase playing a role in either maltodextrin utilization or storage polysaccharide metabolism. However, the α-amylase specificity of AmyC is questionable, as extensive phylogenetic analysis of GH57 and tertiary structural comparison suggest that AmyC could actually be a glycogen-branching enzyme (GBE), a key enzyme in the biosynthesis of glycogen. This communication presents phylogenetic and biochemical evidence that AmyC is a GBE with a relatively high hydrolytic (α-amylase) activity (up to 30% of the total activity), creating a branched α-glucan with 8.5% α-1,6-glycosidic bonds. The high hydrolytic activity is explained by two structural features. First, AmyC has a considerably shorter catalytic loop (residues 213–220) that does not reach the acceptor side. Second, in AmyC, the tryptophan residue (W246) near the active site has its side chain buried in the protein interior, while the side chain is at the surface in the TK1436 and TT1467 GBEs. The putative GBEs from three other Thermotogaceae, with very high sequence similarities to AmyC, were found to have the same structural elements as AmyC, suggesting that GH57 GBEs with relatively high hydrolytic activity may be widespread in nature.

Glycoside hydrolases (GHs) are ubiquitously present in all kingdoms of life. Well-known GHs are α-amylase (EC 3.2.1.1), the enzyme present in, e.g., saliva and the small intestine and responsible for the degradation of starch, and lactase (EC 3.2.1.108), which degrades the milk sugar lactose to glucose and galactose. GHs are classified based on amino acid sequence homology in 152 different families (CAZy) (Cantarel et al. 2009; Lombard et al. 2014). Most GHs have either an inverting or a retaining reaction mechanism as outlined by Koshland (1953).
The crystal structure of AmyC (Dickmanns et al. 2006) showed structural similarity with PDB entry 1UFA, a GH57 enzyme (TT1467) with then unknown function. Santos et al. (2011) determined the crystal structure of another GH57 enzyme, TK1436, from Thermococcus kodakaraensis KOD1, and compared its structure with that of AmyC. TK1436 was found to be a GBE; it features a long and flexible so-called catalytic loop (residues 225-245, TK1436 numbering) folding towards the active site with a tyrosine residue at its tip (Tyr233, TK1436 numbering); this loop was shown to be essential for branching activity and proposed to be involved in substrate binding and/or intermediate product stabilization (Palomo et al. 2011; Santos et al. 2011). AmyC showed a considerably shorter catalytic loop, lacking the corresponding tyrosine residue as well as another conserved tryptophan residue lining the active site groove (Trp270, TK1436 numbering). While TK1436 was found to be functional as a tetramer, AmyC is monomeric. The authors proposed that the differences in tertiary and quaternary structure relate to the fact that AmyC only showed hydrolytic activity on starch-like substrates. This hypothesis was further supported by the observation that TT1467 was also characterized as a GBE (PDB entry 3P0B (Palomo et al. 2011)) and features the same structural elements as TK1436, in which it differs from AmyC.
Nevertheless, a detailed bioinformatic analysis of GH57 enzymes (Blesak and Janecek 2012) clearly showed that AmyC contains the sequence fingerprint of GBEs; thus, it remained intriguing why the biochemical characterization of AmyC only revealed hydrolytic and not transglycosylation (branching) activity. We therefore investigated the phylogeny, activity and three-dimensional structure of AmyC in more detail. This communication presents biochemical evidence in support of the in silico analysis that AmyC is indeed a GBE with relatively high hydrolytic activity (up to 30% of the total activity), and suggests which structural features are responsible for its specificity. Finally, three putative GH57 GBEs are identified based on structural homology to AmyC, suggesting that GH57 GBEs with relatively high hydrolytic activity are more widespread in mesophilic and thermophilic microorganisms.
For all 64 GH57 enzymes and proteins (Supplementary Table S1), their five well-established conserved sequence regions (CSRs) (Zona et al. 2004) were identified according to previous bioinformatics analyses (Blesak and Janecek 2012, 2013; Janecek and Blesak 2011; Martinovičová and Janeček 2018). The evolutionary tree was calculated from the alignment of the five CSRs mentioned above as a Phylip-type tree using neighbour-joining clustering (Saitou and Nei 1987) and the bootstrapping procedure (Felsenstein 1985) (1000 bootstrap trials) implemented in the Clustal-X package (Larkin et al. 2007). The tree was displayed with the program iTOL (http://itol.embl.de/) (Letunic and Bork 2006). Sequence logos were created using the WebLogo 3.0 server (http://weblogo.threeplusone.com/) (Crooks et al. 2004) for the CSRs of all 40 GBE sequences as well as of the single AmyC from T. maritima.

Expression and purification of AmyC
A codon-optimized gene (GenBank ID: MK704497) encoding the GBE from T. maritima MSB8 (AmyC) was synthesized by Baseclear (Leiden, Netherlands), and cloned into pRSET B (Invitrogen) behind the His-tag sequence of the vector. Gene sequence details are provided in the supplemental information. AmyC was overexpressed in Escherichia coli BL21(DE3), cultivated in Luria-Bertani (LB) medium (10 g/L tryptone, 5 g/L yeast extract and 10 g/L NaCl) supplemented with 100 μg/mL ampicillin. GBE expression was induced with 0.1 mM IPTG when the culture had an OD600 of 0.8; the induction was carried out at 16°C for 20 h. Cells were harvested by centrifugation (5000×g, 10 min, 4°C), washed twice with 10 mM sodium phosphate buffer pH 7.0, and lysed using a high-pressure homogenizer (Emulsiflex-B15; Avestin, Canada) in two cycles at 9.0 MPa and room temperature. A cell-free extract was obtained after centrifugation (20,000×g, 30 min, 4°C). AmyC was purified in two steps. First, the cell-free extract was heated at 70°C for 15 min, followed by removal of the denatured proteins by centrifugation (20,000×g, 30 min, 4°C). Second, the His-tagged AmyC was purified using the HisPur™ Ni-NTA Resin (Thermo Fisher Scientific, Waltham, USA) according to the manufacturer's protocol. Protein concentration was quantified using the Quick Start™ Bradford Protein Assay kit (Bio-Rad Laboratories, Hercules, USA). The purity and molecular mass of the proteins were checked by SDS-PAGE.

Enzyme activity assays
The enzyme activity was analysed using the iodine staining assay, monitoring the decrease in absorbance of the glucan-iodine complex (Boyer and Preiss 1978). Amylose V with an average DP of 700 was selected as substrate because it has no α-1,6-linkages detectable by NMR, so any α-1,6-linkage detected is the product of the action of the GBE.
Amylose V (0.125% (w/v)) dissolved in 50 mM sodium phosphate buffer (pH 7.0) was incubated with 132.5 μg/mL AmyC at 50°C. A 10-μL aliquot was transferred to a 96-well plate, 150 μL iodine reagent (aqueous solution of 0.0127% I2 and 0.035% KI) was added, and the absorbance at 660 nm was determined. One unit of enzyme activity is defined as the amount of enzyme that gives a decrease in absorbance of the amylose/iodine complex of 1.0 absorbance unit per minute at 660 nm (Palomo et al. 2011).
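The unit definition above can be illustrated with a short sketch. It assumes that the rate of absorbance decrease is estimated by a least-squares line through a time series of A660 readings; the fitting choice, the function name `iodine_units` and the example readings are illustrative assumptions, not the procedure or data of this study.

```python
def iodine_units(times_min, a660):
    """Return iodine-assay units: the decrease in A660 per minute.

    The rate is the negated least-squares slope of absorbance versus time
    (an assumed way of estimating the rate from several readings).
    """
    if len(times_min) != len(a660) or len(times_min) < 2:
        raise ValueError("need at least two paired readings")
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(a660) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a660))
    slope = sxy / sxx  # absorbance change per minute (negative when amylose is consumed)
    return -slope


# Hypothetical readings taken every 30 min (not measured data)
units = iodine_units([0, 30, 60, 90], [1.2, 0.9, 0.6, 0.3])
```

Dividing the result by the amount of protein in the assay would give a specific activity in units per mg.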
The influence of Ca2+ on AmyC activity was tested at 50°C to 80°C in 50 mM Tris-Cl buffer, with the pH adjusted to 7.5 at each reaction temperature. CaCl2 was applied at 0.1, 1, 3 and 5 mM. The activity was measured by the iodine assay as described above.
The hydrolytic and transglycosylation activities of AmyC with amylose V as substrate were determined by measuring the increase in reducing ends by the bicinchoninic acid (BCA) method before and after debranching the product, respectively. Amylose V was dissolved in 1 M sodium hydroxide and then neutralized to pH 7.0. A mixture of 0.125% (w/v) amylose V in 50 mM sodium phosphate buffer (pH 7.0) and 132.5 μg/mL AmyC was incubated at 50°C. Samples of 500 μL were taken at different time points and AmyC was inactivated by incubating the samples at 100°C for 10 min. To debranch the product, a 200-μL sample was mixed with 1 μL 1 M citric acid, 1 U isoamylase, 0.7 U pullulanase and 1 μL 1 M CaCl2 and then incubated at 40°C for 16 h. The hydrolytic activity was measured by following the increase in reducing ends during the reaction, as each product of hydrolysis bears a terminal, reducing glucose residue. Transglycosylation, or branching activity, was measured by treating the reaction product with the debranching enzymes isoamylase and pullulanase, which specifically hydrolyze α-1,6-linkages; each product of this specific hydrolysis of an α-1,6-linkage also bears a terminal, reducing glucose residue. The increase in reducing ends, i.e. the amount of reducing ends after debranching minus the amount before debranching, is a direct result of the transglycosylation/branching activity. One unit of branching activity is defined as 1 μmol of α-1,6-linkage synthesized per minute, and one unit of hydrolytic activity is defined as 1 μmol of reducing end synthesized per minute.
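The arithmetic behind the two activities can be sketched as follows. The function name `amyc_activities`, its argument layout and the input numbers are hypothetical; the inputs are invented solely so that the calculation can be followed by hand and are not measurements from this study.

```python
def amyc_activities(before_umol, after_umol, minutes, protein_mg):
    """Split total activity into hydrolytic and branching parts (mU/mg).

    before_umol, after_umol: (start, end) reducing ends in umol, measured
    before and after debranching with isoamylase/pullulanase.
    """
    # Hydrolysis: reducing ends formed during the reaction itself
    hydrolysis_umol = before_umol[1] - before_umol[0]
    # Branch points: extra reducing ends released by debranching,
    # corrected for any already present at the start of the reaction
    branches_start = after_umol[0] - before_umol[0]
    branches_end = after_umol[1] - before_umol[1]
    branching_umol = branches_end - branches_start
    # umol per min per mg is U/mg; multiply by 1000 for mU/mg
    scale = 1000.0 / (minutes * protein_mg)
    hydrolytic = hydrolysis_umol * scale
    branching = branching_umol * scale
    return {"hydrolytic": hydrolytic,
            "branching": branching,
            "total": hydrolytic + branching}


# Hypothetical 2-h incubation with 1 mg protein (illustrative numbers only)
result = amyc_activities(before_umol=(0.10, 0.46),
                         after_umol=(0.10, 1.54),
                         minutes=120, protein_mg=1.0)
```

With these made-up inputs the hydrolytic share is one quarter of the total, which shows how a hydrolytic fraction of the total activity would be derived.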

Influence of pH and temperature on activity
The influence of pH on AmyC activity was measured at 50°C in 50 mM sodium phosphate buffer (pH 6.0 to 9.0) using the iodine assay as described above. The influence of temperature on AmyC activity was determined at pH 7.0 in 50 mM sodium phosphate buffer using the activity assay mixture incubated at temperatures ranging from 40 to 90°C.

High performance anion exchange chromatography
Oligosaccharide analysis was carried out by high-performance anion exchange chromatography (HPAEC) on a Dionex ICS-3000 system (Thermo Scientific) equipped with a 4 × 250 mm CarboPac PA-1 column. A pulsed amperometric detector with a gold electrode and an Ag/AgCl pH reference electrode was used. The system was run with a gradient of 30-600 mM NaAc in 100 mM NaOH at 1 mL/min. Chromatograms were analysed using Chromeleon 6.8 chromatography data system software (Thermo Scientific). A mixture of glucose, maltose, maltotriose, maltotetraose, maltopentaose, maltohexaose and maltoheptaose was used as reference. AmyC-modified product was dialyzed against ultrapure water using dialysis tubing with a cutoff size of 100 Da to 500 Da. Two milligrams of dry material was dissolved in 1 mL 5 mM sodium acetate buffer pH 5.0 with 5 mM CaCl2. Five hundred microliters of the solution was mixed with 2.5 U isoamylase and 1.75 U pullulanase, and incubated at 40°C for 16 h.
The degree of branching was calculated as

$$\text{Degree of branching} = \frac{S_{\alpha\text{-}1,6}}{S_{\alpha\text{-}1,4} + S_{\alpha\text{-}1,6}}$$

where $S_{\alpha\text{-}1,6}$ is the peak area of α-1,6, integrated from NMR spectra, and $S_{\alpha\text{-}1,4}$ is the peak area of α-1,4, integrated from NMR spectra.
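The degree of branching defined above, the α-1,6 peak area divided by the summed α-1,4 and α-1,6 peak areas, can be computed as in the following sketch. The function name and the example peak areas are illustrative assumptions; the areas are arbitrary numbers, not integrals from the recorded spectra.

```python
def degree_of_branching(area_a16, area_a14):
    """Fraction of alpha-1,6 linkages from integrated NMR peak areas."""
    total = area_a14 + area_a16
    if total <= 0:
        raise ValueError("summed peak area must be positive")
    return area_a16 / total


# Arbitrary example areas (same units for both peaks)
fraction = degree_of_branching(area_a16=8.5, area_a14=91.5)
percent = 100.0 * fraction  # expressed as a percentage of all glycosidic bonds
```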

Sequence analysis of GH57 GBEs
Analysis of the GH57 GBE sequences and phylogenetic tree construction were performed as described in the 'Materials and methods' section. Sequences of 40 GBEs (Supplementary Table S1) were collected based on the recent exhaustive in silico analysis of the entire α-amylase family GH57 (Martinovičová and Janeček 2018) that, of all 1602 GH57 sequences taken from the CAZy database (Cantarel et al. 2009), yielded 546 GBEs. Forty GBEs were selected in an effort to obtain a representative sample of GBE sequences comprising, in addition to AmyC from T. maritima (Dickmanns et al. 2006), the biochemically characterized enzymes from T. kodakaraensis (Murakami et al. 2006; Santos et al. 2011), T. thermophilus (Palomo et al. 2011) and P. horikoshii (Na et al. 2017), accompanied by a range of hypothetical GBEs covering various taxa from both Bacteria and Archaea, including all available sequences from the phylum Thermotogae. To make the analysis as relevant as possible and in accordance with previous in silico studies (Blesak and Janecek 2012; Blesak and Janecek 2013; Janecek and Blesak 2011; Martinovičová and Janeček 2018), the set of sequences was completed by 23 biochemically characterized family GH57 members representing other enzyme specificities, accompanied by one putative representative of the α-amylase-like protein (Supplementary Table S1).
The evolutionary tree constructed from these selected sequences shows that the (putative) GH57 GBEs cluster together (Fig. 1). Comparison of the CSRs of the 40 (putative) GH57 GBEs reveals that most, but not all, of the amino acid residues of the CSRs are conserved (Fig. 2). AmyC was also aligned with the three characterized GH57 GBEs, demonstrating their similarity while also revealing that some loops are distinct (Fig. S1).

Activity
AmyC was overexpressed in E. coli and purified to homogeneity as judged by SDS-PAGE (Fig. 3a). Previous studies reported optimal conditions of 90°C and pH 8.5 (Dickmanns et al. 2006). As a first approach, the activity of the purified recombinant enzyme was tested at 90°C and pH 8.5 using the iodine staining assay and amylose V as substrate. The absorbance of the glucan-iodine complex did not change, which showed that the recombinant AmyC was not active under these conditions. Subsequently, the influence of temperature and pH on AmyC activity was investigated in detail. AmyC showed activity at temperatures of 80°C and below. Maximum activity was found at 50°C and pH 7.0 (Fig. 3). AmyC was not active in the presence of Ca2+ at high temperature, which is in agreement with Ballschmiter et al. (2006).

Fig. 1 … Table S1). The label of each GBE (shown in red) consists of the GenBank accession number, the letter 'A' or 'B' indicating archaeal or bacterial origin, respectively, and the name of the organism; for the remaining enzyme specificities, the abbreviation of the enzyme name is also added as follows: AAMY, α-amylase; AAMY-like protein, α-amylase-like protein; 4AGT, 4-α-glucanotransferase; APU, amylopullulanase; APU-CMD, amylopullulanase/cyclomaltodextrinase; AMY, non-specified amylase; MGA, maltogenic amylase; AGAL, α-galactosidase. The four GBEs with known three-dimensional structures are marked by an asterisk. The AmyC from Thermotoga maritima is emphasized by colour inversion. All GBEs from the phylum Thermotogae (i.e. genera Thermotoga, Kosmotoga, Mesotoga, Defluviitoga, Petrotoga and Pseudothermotoga) are signified by red full circles

Fig. 2 Sequence logos of 40 GBEs from the family GH57 analysed in the present study (top) and of the AmyC from Thermotoga maritima (bottom). CSR-1, residues 1-5; CSR-2, residues 6-11; CSR-3, residues 12-17; CSR-4, residues 18-27; CSR-5, residues 28-36. The catalytic nucleophile (at position 15, glutamate) and the proton donor (at position 20, aspartate) in both logos are indicated by asterisks

AmyC was incubated with amylose V and samples were taken over time. The amount of reducing ends increased gradually during the reaction (Fig. 4a), due to the α-amylase activity of AmyC. Importantly, however, a clear increase in reducing ends was also found when the product of the incubation of amylose V and AmyC was treated with isoamylase/pullulanase (Fig. 4a). From the increase in reducing ends before and after debranching, the hydrolytic and transglycosylating activities were calculated. The total activity of AmyC is 12 mU/mg protein, calculated from the reducing ends at 0 h and 2 h, consisting of a transglycosylating activity of 9 mU/mg protein and a hydrolytic activity of 3 mU/mg protein.
The 24-h product derived from amylose V was analysed by HPAEC-PAD before and after debranching (Fig. 5). AmyC produced mainly branched products, with linear oligosaccharides of DP 1 to 8 as by-products and minor amounts of short-chain branched oligosaccharides, visible as small peaks directly next to the larger linear oligosaccharide peaks (Fig. 5a). After debranching, more short-chain linear oligosaccharides and, in addition, longer linear oligosaccharides were found, representing the newly synthesized side chains (Fig. 5b). AmyC introduces side chains ranging from DP 2 up to DP 30, with DP 5 as the most abundant side chain (Fig. 5b). The average chain length is 6.6.
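An average chain length such as the one reported above is commonly obtained as a number average over the debranched chain distribution. The sketch below shows that calculation under the assumption that relative molar amounts per DP are available (PAD peak areas depend on DP, so they would first need a response correction); the function name and the example distribution are hypothetical and do not reproduce the measured profile.

```python
def average_chain_length(chain_amounts):
    """Number-average chain length from a mapping DP -> relative molar amount."""
    total = sum(chain_amounts.values())
    if total <= 0:
        raise ValueError("total chain amount must be positive")
    return sum(dp * amount for dp, amount in chain_amounts.items()) / total


# Hypothetical distribution: twice as many DP 5 chains as DP 8 chains
acl = average_chain_length({5: 2.0, 8: 1.0})
```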

Structural analysis of AmyC
Superposition of AmyC (PDB entry 2B5D) with GH57 GBEs TT1467 (PDB entry 3P0B) and TK1436 (PDB entry 3N98) resulted in root mean square deviations of 1.21 and 1.34 Å (on Cα atoms), respectively, in accordance with their very similar core architecture (Fig. 6). Differences are mainly observed in a few loops, some of which are near the proposed substrate-binding groove. Importantly, the loop of AmyC (residues 213-220) equivalent to the catalytic loop of TT1467/TK1436 is 12 or 11 residues shorter, and, as noted by Santos et al. (2011), cannot reach the acceptor subsites of the active site groove; residue Tyr220 in this loop lies at the side of domain A, about 30 Å from the catalytic site. A second distinct feature of AmyC is the helical element comprising residues 239-246; it is shifted towards the catalytic site and has an imperfect α-helical conformation. This α-helix carries Trp246, equivalent to the gatekeeper Trp274/Trp270 of TT1467/TK1436, but with a side-chain conformation that buries it in the protein interior. Third, the loop connecting the 2nd and 3rd long α-helix in domain C, designated 'lid 2' in T. litoralis 4-α-glucanotransferase (TlGT), is only partly conserved and has a conformation that brings it closer to the active site groove.

Fig. 3 SDS-PAGE of purified recombinant AmyC (a), and effect of temperature (b) and pH (c) on AmyC activity. The activity was measured by the iodine staining assay. Amylose V with DP 700 was used as substrate; samples were measured every 30 min over a total incubation time of 3 h

Fig. 4 The branching and hydrolytic activity of AmyC (a), and 1H-NMR spectra of AmyC branched α-glucan (b). The activities are quantified by following the increase in reducing ends over time. Amylose V (0.125%) was incubated with AmyC (132.5 μg/mL) in phosphate buffer with pH 7.0 at 50°C. The samples were debranched by isoamylase and pullulanase. 1H-NMR spectra of branched α-glucan made by AmyC from amylose V in D2O, recorded at 50°C
Finally, at the lower end of the active site groove, the loop connecting helix α1 and strand β1, carrying residue F24, runs differently from that in other GH57 GBEs (not shown). From Table 1, it is obvious that of the aromatic 'gatekeeper' residues of TT1467/TK1436, residues Trp29, Trp402 and Trp411 of AmyC are conserved; however, Trp246 cannot function due to its buried conformation. Moreover, two of the five other aromatic residues contributing to a hydrophobic substrate-binding groove are at a different position or absent in AmyC. Taken together, the AmyC active site groove, compared to TT1467/TK1436, is more open at the acceptor-binding end (Fig. 7) and likely has less affinity for acceptor carbohydrates.

Structural homology modelling of putative GH57 branching enzymes
Structural homology models of the putative GH57 GBEs from Mesotoga prima, Kosmotoga olearia and Kosmotoga pacifica were generated using the Phyre server (Kelley et al. 2015). The generated homology models of the M. prima, K. olearia and K. pacifica putative GBEs were superimposed with the AmyC crystal structure, resulting in very low root mean square deviations of 0.22, 0.23 and 0.17 Å, for 452, 415 and 434 Cα atoms, respectively.

Fig. 6 Structural comparison of GH57 GBEs, showing AmyC (left), TT1467 (middle) and TK1436 (right) with structural differences around the active site groove; highlighted are the catalytic loop (red), the helix with the gatekeeper tryptophan residue (orange), lid 2 (green) and some of the other aromatic residues as well as the two catalytic residues. Importantly, in AmyC, the catalytic loop is too short to reach the active site as it does in TT1467 and TK1436; e.g. tyrosine 220 in this loop is at about 30 Å distance from the catalytic site. Moreover, residue W246 of AmyC is in a buried conformation, unable to function as a binding platform for acceptor carbohydrates. Residues not visible in the structures are indicated with a dotted line
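For readers unfamiliar with the metric, the root mean square deviation over matched Cα atoms can be sketched as below. The sketch assumes that the two structures have already been superimposed and that the Cα atoms are paired one-to-one by an alignment; it does not perform the superposition itself, and the function name and coordinates are illustrative only.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (same unit as the input, e.g. angstrom) over paired (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equally long coordinate lists")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared distance between one pair of matched C-alpha atoms
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Toy coordinates: every atom displaced by 1 along z
rmsd = ca_rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
               [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
```

Because the value is averaged over the matched atoms, RMSDs computed over different numbers of Cα atoms are not strictly comparable, which is why the atom counts are stated alongside the values.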

Discussion
AmyC from T. maritima (Dickmanns et al. 2006) was originally described as a family GH57 α-amylase randomly hydrolyzing amylose and soluble starch, forming glucose, maltose and maltotriose as the main products. However, possible α-glucan branching activity was first ascribed to AmyC based on a detailed in silico sequence analysis (Blesak and Janecek 2012), mainly due to the presence of a cysteine residue (Cys186) in CSR-3 (Fig. 2), which was suggested to be a clear branching enzyme sequence feature. Additional support for AmyC branching activity is the presence of a tyrosine (Tyr220) corresponding to Tyr236 of the T. thermophilus GBE (Fig. S1). This residue is positioned in a loop (235_PYGEAALG) between CSR-3 and CSR-4; this loop (also called the catalytic loop) was considered essential for enzyme specificity because the Y236A mutant lost the branching activity with simultaneous …

… (red); distorted helix (orange); lid 2 (green). The approximate positions of one donor subsite (−1) and three acceptor subsites (+1 to +3) are derived from a superposition with the structure of T. litoralis 4-α-glucanotransferase (TlGT) complexed with acarbose (not shown). As a result of the much shorter catalytic loop (red) in AmyC, its active site groove is less occluded at acceptor subsites