Abstract
Molecular design is essential and ubiquitous in chemistry, physics, biology, and materials science. Because the space of candidate molecules is immense, exploring it efficiently and effectively requires novel optimization strategies and algorithms. This paper summarizes current progress toward practical theoretical optimization schemes for molecular design, with emphasis on emergent strategies for inverse molecular design. Several representative design examples, based on recently developed strategies, are described to demonstrate the principles of inverse molecular design.
Additional information
Supported by the University of Pittsburgh Center for Chemical Methodologies & Library Development (2P50GM067082) and the National Science Foundation (CHE-06-16849).
WY is most grateful to Professor Guangxian Xu for introducing him to Professor Robert G. Parr, under whose supervision WY carried out his Ph.D. studies in density functional theory. Professor Xu’s scholarship has always been an inspiration for WY. He and his co-authors wish Professor Xu a happy 90th birthday and many more happy birthdays to come.
About this article
Cite this article
Hu, X., Beratan, D.N. & Yang, W. Emergent strategies for inverse molecular design. Sci. China Ser. B-Chem. 52, 1769–1776 (2009). https://doi.org/10.1007/s11426-009-0260-3
DOI: https://doi.org/10.1007/s11426-009-0260-3