
Emergent strategies for inverse molecular design

Science in China Series B: Chemistry

Abstract

Molecular design is essential and ubiquitous in chemistry, physics, biology, and materials science. The space of candidate molecules is immense, so exploring it efficiently and effectively requires novel optimization strategies and algorithms. This paper summarizes current progress toward practical theoretical optimization schemes for molecular design, with emphasis on emergent strategies for inverse molecular design. Several representative design examples, based on recently developed strategies, are described to demonstrate the principles of inverse molecular design.



Author information

Corresponding authors

Correspondence to David N. Beratan or WeiTao Yang.

Additional information

Supported by the University of Pittsburgh Center for Chemical Methodologies & Library Development (2P50GM067082) and the National Science Foundation (CHE-06-16849).

WY is most grateful to Professor Guangxian Xu for introducing him to Professor Robert G. Parr, under whose supervision WY carried out his Ph.D. studies in density functional theory. Professor Xu’s scholarship has always been an inspiration for WY. He and his co-authors wish Professor Xu a happy 90th birthday and many more happy birthdays to come.


About this article

Cite this article

Hu, X., Beratan, D.N. & Yang, W. Emergent strategies for inverse molecular design. Sci. China Ser. B-Chem. 52, 1769–1776 (2009). https://doi.org/10.1007/s11426-009-0260-3


  • DOI: https://doi.org/10.1007/s11426-009-0260-3
