Abstract
A computational mutagenesis methodology that utilizes a four-body, knowledge-based, statistical contact potential is applied toward quantifying relative changes (residual scores) to sequence-structure compatibility in E. coli lac repressor due to single amino acid residue substitutions. We show that these residual scores correlate well with experimentally measured relative changes in protein activity caused by the mutations. The approach also yields a measure of environmental perturbation at every residue position in the protein caused by the mutation (residual profile). Supervised learning with a decision tree algorithm, utilizing the residual profiles of over 4000 experimentally evaluated mutants for training, classifies the mutants based on activity with nearly 79% accuracy while achieving 0.80 area under the receiver operating characteristic curve. A trained decision tree model is subsequently used to infer the levels of activity for all remaining unexplored lac repressor mutants.
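The two quantities named in the abstract, the residual score and the residual profile, can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the structure is taken to be already reduced to a fixed list of residue quadruplets (for example, simplices from a Delaunay tessellation), a mutation is taken to change only the residue identity at one position while the quadruplets stay fixed, and the score of a position is taken to be the sum of the four-body scores of the quadruplets that contain it. All function names, the toy sequence, and the potential values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def quad_score(sequence, simplex, potential):
    """Four-body score of one quadruplet; the key is order-independent."""
    key = "".join(sorted(sequence[i] for i in simplex))
    return potential.get(key, 0.0)  # unseen quadruplets score 0 (assumption)

def total_score(sequence, simplices, potential):
    """Sequence-structure compatibility: sum over all quadruplets."""
    return sum(quad_score(sequence, s, potential) for s in simplices)

def position_scores(sequence, simplices, potential):
    """Per-position score: sum over the quadruplets containing that position."""
    scores = defaultdict(float)
    for s in simplices:
        value = quad_score(sequence, s, potential)
        for i in s:
            scores[i] += value
    return [scores[i] for i in range(len(sequence))]

def mutate(sequence, position, new_residue):
    """Substitute one residue; the quadruplet list is left unchanged."""
    return sequence[:position] + new_residue + sequence[position + 1:]

def residual_score(wild_type, mutant, simplices, potential):
    """Relative change in total score (mutant minus wild type)."""
    return (total_score(mutant, simplices, potential)
            - total_score(wild_type, simplices, potential))

def residual_profile(wild_type, mutant, simplices, potential):
    """Change in the score of every position caused by the mutation."""
    wt = position_scores(wild_type, simplices, potential)
    mt = position_scores(mutant, simplices, potential)
    return [m - w for m, w in zip(mt, wt)]

# Toy data (hypothetical values, for illustration only).
wild_type = "ALKVE"
simplices = [(0, 1, 2, 3), (1, 2, 3, 4)]
potential = {"AKLV": 0.5, "EKLV": -0.25, "AGKV": -0.5, "EGKV": 0.25}

mutant = mutate(wild_type, 1, "G")  # L -> G at index 1
print(residual_score(wild_type, mutant, simplices, potential))
print(residual_profile(wild_type, mutant, simplices, potential))
```

In this sketch the residual profile is a vector with one entry per residue position, which is the kind of fixed-length feature vector that a decision tree classifier could be trained on.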
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Masso, M., Hijazi, K., Parvez, N., Vaisman, I.I. (2008). Computational Mutagenesis of E. coli Lac Repressor: Insight into Structure-Function Relationships and Accurate Prediction of Mutant Activity. In: Măndoiu, I., Sunderraman, R., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2008. Lecture Notes in Computer Science, vol 4983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79450-9_36
DOI: https://doi.org/10.1007/978-3-540-79450-9_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79449-3
Online ISBN: 978-3-540-79450-9
eBook Packages: Computer Science (R0)