Abstract
Genetic regulatory proteins inducible by small molecules are useful synthetic biology tools as sensors and switches. Bacterial allosteric transcription factors (aTFs) are a major class of regulatory proteins, but few aTFs have been redesigned to respond to new effectors beyond natural aTF-inducer pairs. Altering inducer specificity in these proteins is difficult because substitutions that affect inducer binding may also disrupt allostery. We engineered an aTF, the Escherichia coli lac repressor, LacI, to respond to one of four new inducer molecules: fucose, gentiobiose, lactitol and sucralose. Using computational protein design, single-residue saturation mutagenesis or random mutagenesis, along with multiplex assembly, we identified new variants comparable in specificity and induction to wild-type LacI with its inducer, isopropyl β-D-1-thiogalactopyranoside (IPTG). The ability to create designer aTFs will enable applications including dynamic control of cell metabolism, cell biology and synthetic gene circuits.
Acknowledgements
We thank B. Turczyk and D. Weigand for synthesizing the single-amino-acid substitution library on the Custom Array synthesizer, and G. Cuneo and V. Toxavidis for assistance with flow cytometry and FACS. We thank Rosetta@home participants for providing the computing resources necessary for this work. This work was supported by the US Department of Energy (DOE) (DE-FG02-02ER63445 to G.M.C.), a Wyss Technology Development Fellowship (to S.R.) and the US National Institute of General Medical Sciences (grant 1P41 GM103533 to S.F.). The sucralose-responsive LacI mutant was purified and crystallized with assistance from the UCLA-DOE Protein Expression Technology Center, the UCLA-DOE X-ray Crystallography Core Facility (both supported by DOE grant DE-FC02-02ER63421) and the UCLA Crystallization Core Facility; in particular we thank M. Collazo for help with protein crystallization. X-ray data collection was facilitated by M. Capel, K. Rajashankar, N. Sukumar, F. Murphy and I. Kourinov of the Northeastern Collaborative Access Team beamline 24-ID-C at the Advanced Photon Source of Argonne National Laboratory, which is supported by US National Institutes of Health grants P41 RR015301 and P41 GM103403. Use of the Advanced Photon Source is supported by the DOE under contract DE-AC02-06CH11357.
Author information
Contributions
N.D.T., F.J.I., G.M.C. and S.R. conceived the study. N.D.T., S.F., G.M.C. and S.R. designed experiments. N.D.T., A.S.G. and S.R. performed experiments and carried out bioinformatic studies. R.M. and D.B. generated computational protein design candidates. S.C., D.C., M.A.A. and S.K. solved the crystal structure of a sucralose-binding variant. S.K. helped with Agilent OLS chip library design. J.K.R. helped optimize screening protocols. N.D.T., A.S.G., S.F., G.M.C. and S.R. analyzed the data. N.D.T., A.S.G., S.F., G.M.C. and S.R. wrote the paper.
Ethics declarations
Competing interests
S.R., N.D.T. and G.M.C. have filed a patent application (PCT/US15/16868) covering biosensor design methods.
Integrated supplementary information
Supplementary Figure 1 Chemical structure of ligands.
Allolactose and IPTG (native and synthetic inducers of LacI, respectively), and the four new inducers – fucose, lactitol, sucralose and gentiobiose.
Supplementary Figure 2 Coverage of all single-amino-acid substitutions found in the pre-selection single-site saturation mutagenesis library at gentiobiose-responsive positions.
Residues are ordered by protein region as in Supplementary Fig. 8 for comparison. (a) For each position of wild-type LacI, 19 substitutions were synthesized for the single-amino-acid substitution library. By next-generation sequencing, we measured how many of the 19 possible substitutions were found either before or after negative selection for positions showing response to gentiobiose. Mutants absent from the input library were likely lost to synthesis or cloning inefficiencies. (b) Most (>80%) of the positions involved in gentiobiose response were found to have at least 18 of 19 single-amino-acid substitutions prior to positive selection. For all 360 positions of LacI (not shown), 195 (~54%) positions contained all 19 substitutions, 238 (~66%) contained at least 18, and 306 (85%) contained at least 14.
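The coverage figures quoted in this caption can be tallied with a short script. The following is a minimal sketch, not the authors' pipeline: the function names, the input format (a list of position and amino-acid pairs taken from sequencing reads) and the toy three-residue sequence in the usage note are all assumptions made for illustration.

```python
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def coverage_by_position(wild_type, observed):
    """Count distinct non-wild-type substitutions seen at each position.

    wild_type: protein sequence string; position 1 is wild_type[0].
    observed:  iterable of (position, amino_acid) pairs (hypothetical
               input format, e.g. parsed from sequencing reads).
    """
    seen = defaultdict(set)
    for pos, aa in observed:
        # Ignore non-standard symbols and the wild-type residue itself.
        if aa in AMINO_ACIDS and aa != wild_type[pos - 1]:
            seen[pos].add(aa)
    # Positions never observed get a count of 0.
    return {pos: len(seen[pos]) for pos in range(1, len(wild_type) + 1)}


def fraction_with_at_least(counts, k):
    """Fraction of positions carrying at least k of the 19 substitutions."""
    return sum(1 for n in counts.values() if n >= k) / len(counts)
```

For a toy wild-type sequence "MKP" with all 19 substitutions observed at position 1, 18 at position 2 and none at position 3, `fraction_with_at_least(counts, 18)` gives two thirds. The same arithmetic applied to the caption's numbers gives 195/360 ≈ 54%, 238/360 ≈ 66% and 306/360 = 85%.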
Supplementary Figure 3 Flow cytometric characterization of aTF library screening.
RFU denotes relative fluorescence units. (a) Flow cytometry histogram of a representative LacI variant library before (red) and after (blue) colicin E1 negative selection. (b) Flow cytometry histogram of a representative LacI variant library with no inducer molecule present (blue) or exposed to a new target inducer molecule (red).
Supplementary Figure 4 Fold induction of WT LacI with IPTG inducer.
Dose-response curve of WT LacI with IPTG, with fold induction shown on the y-axis and IPTG concentration (mM) on the x-axis.
Supplementary Figure 5 Sequence and fold induction of the top-scoring full-length Rosetta design variants.
Induction response in relative fluorescence units (RFU) with and without ligand for WT LacI and top five scoring full-length Rosetta design variants for sucralose, lactitol and fucose. WT LacI was induced with IPTG, and the full-length Rosetta design variants were induced with their respective target ligands. The mutations in each Rosetta design variant are shown above the bar graph. All ligands were supplemented at 10 mM.
Supplementary Figure 6 Comparison of fucose, lactitol and sucralose response versus single-amino-acid substitutions found after negative selection.
(a) Fucose-responsive induction values are shown in pink. The induction values show the maximum weighted fold-change of response after positive selection. The black outlines indicate depletion of next-generation sequencing reads for single-amino-acid substitutions after negative selection. The depletion value is the log2 of the ratio of read counts before negative selection to read counts after negative selection. Higher depletion values indicate position and side-chain combinations that are lost after negative selection. Read counts were quantile normalized between pre- and post-selection separately for each amplicon (see Online Methods). (b) Lactitol-responsive induction values versus depletion values. (c) Sucralose-responsive induction values versus depletion values. Negative depletion values are not shown.
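The depletion value described in this caption can be sketched for a single amplicon as follows. This is an illustrative reading of the caption, not the procedure from the Online Methods: the two-sample quantile normalization shown here, the arbitrary tie-breaking by input order and the pseudocount of 1 (added to avoid division by zero) are assumptions, and the function names are hypothetical.

```python
import math


def _ranks(values):
    """Rank of each value (0 = smallest); ties broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def quantile_normalize(pre, post):
    """Map both count vectors onto one shared reference distribution.

    The reference value at each rank is the mean of the two sorted
    vectors at that rank; each count is replaced by the reference
    value at its own rank.
    """
    reference = [(a + b) / 2.0 for a, b in zip(sorted(pre), sorted(post))]
    pre_n = [reference[r] for r in _ranks(pre)]
    post_n = [reference[r] for r in _ranks(post)]
    return pre_n, post_n


def depletion(pre, post, pseudocount=1.0):
    """log2(pre / post) per variant after quantile normalization.

    pre, post: read counts for the same variants of one amplicon,
    before and after negative selection. The pseudocount is an
    assumption, not taken from the source.
    """
    pre_n, post_n = quantile_normalize(pre, post)
    return [math.log2((a + pseudocount) / (b + pseudocount))
            for a, b in zip(pre_n, post_n)]
```

Positive values mark variants that lose reads during negative selection. Negative values, which the caption omits from the figure, mark variants that gain relative to the rest of the library.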
Supplementary Figure 7 Comparison of conservation of amino acids and mutations found for fucose response.
Mutations conferring fucose response in LacI are shown as red outlines. (a) A set of 41 LacI orthologs was aligned, and the frequency of amino acid utilization is shown in blue. (b) Five experimentally validated sequences of GalR/S known to bind fucose were aligned with E. coli LacI and are shown with respect to LacI positions. Mutations at positions 79 and 273 overlap with preferentially conserved amino acids in the GalR set, shown with arrows. The highest inducer at position 291 was conserved in neither LacI nor fucose-responding GalR/S.
Supplementary Figure 8 Comparison of gentiobiose response versus single-amino-acid substitutions found after negative selection.
Induction values for gentiobiose-responding mutants are shown in pink. The induction values show the maximum weighted fold-change of response after positive selection. The color shades outside the amino acid substitution profile denote the location of the residue in the binding pocket, dimerization interface or DNA-binding domain, or mark it as unclassified. The black outlines indicate depletion of next-generation sequencing reads for single-amino-acid substitutions. The depletion value is the log2 of the ratio of read counts before negative selection to read counts after negative selection. Read counts were quantile normalized between pre- and post-selection separately for each amplicon (see Online Methods).
Supplementary Figure 9 Cross-reactivity of additional LacI variants toward three other untargeted inducers and IPTG.
For additional variants displayed in Fig. 2, a dose-response curve was determined for non-target ligands and IPTG. Values displayed represent the highest fold induction at any ligand concentration. Inducers are colored as follows: gentiobiose, red; fucose, green; lactitol, blue; sucralose, magenta; and IPTG, black. Variants displayed were designed for binding to (a) gentiobiose, (b) fucose, (c) lactitol, and (d) sucralose. Error bars represent standard deviation of fold induction from three biological replicates.
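The summary statistic in this caption (highest fold induction across concentrations, with a standard deviation over replicates) can be sketched as below. Several choices here are assumptions rather than details from the source: fold induction is taken as induced RFU divided by uninduced RFU for the same replicate, the peak is chosen by mean fold induction, and the sample standard deviation is used. The function names and data layout are hypothetical.

```python
from statistics import mean, stdev


def fold_induction(induced_rfu, uninduced_rfu):
    """Ratio of fluorescence with ligand to fluorescence without it."""
    return induced_rfu / uninduced_rfu


def peak_fold_induction(dose_response, uninduced):
    """Return (mean_fold, sd_fold, concentration) at the best dose.

    dose_response: {concentration_mM: [rfu_rep1, rfu_rep2, rfu_rep3]}
    uninduced:     [rfu_rep1, rfu_rep2, rfu_rep3], matched by replicate
    """
    best = None
    for conc, induced in dose_response.items():
        folds = [fold_induction(i, u) for i, u in zip(induced, uninduced)]
        summary = (mean(folds), stdev(folds), conc)
        # Keep the concentration with the highest mean fold induction.
        if best is None or summary[0] > best[0]:
            best = summary
    return best
```

With three replicates, `stdev` divides by n - 1 = 2; `statistics.pstdev` would give the population form instead. The caption does not say which was used.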
Supplementary Figure 10 User guide for aTF redesign.
A detailed flowchart that guides the user through the choice of mutagenesis methods based on the choice of the target ligand for aTF redesign. We offer general guidelines on what we consider acceptable fold induction and specificity values by target and native ligands, presented as proportion of WT aTF induction, after the two-stage enrichment screen and following activity maturation. These guidelines could be adjusted on a case-by-case basis depending on the number and quality of ligand-responsive variants after the two-stage enrichment screen, and the nature of downstream application.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–10, Supplementary Tables 1–6 and Supplementary Note (PDF 1962 kb)
About this article
Cite this article
Taylor, N., Garruss, A., Moretti, R. et al. Engineering an allosteric transcription factor to respond to new ligands. Nat Methods 13, 177–183 (2016). https://doi.org/10.1038/nmeth.3696
DOI: https://doi.org/10.1038/nmeth.3696
- Springer Nature America, Inc.