Reverse engineering chemical structures from molecular descriptors: how many solutions?

Journal of Computer-Aided Molecular Design

Summary

Physical, chemical and biological properties are the ultimate information of interest for chemical compounds. Molecular descriptors that map structural information to activities and properties are obvious candidates for information sharing. In this paper, we consider the feasibility of using molecular descriptors to safely exchange chemical information in such a way that the original chemical structures cannot be reverse engineered. To investigate the safety of sharing such descriptors, we compute the degeneracy (the number of structures matching a descriptor value) of several 2D descriptors, and use various methods to search for and reverse engineer structures. We examine degeneracy in the entire chemical space, taking descriptor values from the alkane isomer series and the PubChem database. We further use a stochastic search to retrieve structures matching specific topological index values. Finally, we investigate the safety of exchanging fragmental descriptors using deterministic enumeration.
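
To make the notion of degeneracy concrete, the following minimal sketch counts how many heptane isomers share each value of one topological index. It is an illustration only, not the authors' method: the summary does not name the descriptors studied, so the Wiener index (the sum of shortest-path distances over all pairs of carbon atoms) is an assumed example, and the hand-written edge lists and the names `HEPTANE_ISOMERS`, `wiener_index` and `degeneracy` are hypothetical.

```python
from collections import Counter, deque

# Hydrogen-suppressed carbon skeletons of the nine heptane (C7H16) isomers,
# written by hand as edge lists over atoms 0..6 (an assumption of this sketch).
HEPTANE_ISOMERS = {
    "n-heptane":             [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
    "2-methylhexane":        [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (1, 6)],
    "3-methylhexane":        [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (2, 6)],
    "2,2-dimethylpentane":   [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (1, 6)],
    "2,3-dimethylpentane":   [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (2, 6)],
    "2,4-dimethylpentane":   [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (3, 6)],
    "3,3-dimethylpentane":   [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (2, 6)],
    "3-ethylpentane":        [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (5, 6)],
    "2,2,3-trimethylbutane": [(0, 1), (1, 2), (2, 3), (1, 4), (1, 5), (2, 6)],
}


def wiener_index(edges):
    """Sum of shortest-path distances over all unordered atom pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())

    # Each pair was counted once from each of its two endpoints.
    return total // 2


def degeneracy(isomers):
    """Map each descriptor value to the number of structures that share it."""
    return Counter(wiener_index(edges) for edges in isomers.values())


if __name__ == "__main__":
    for value, count in sorted(degeneracy(HEPTANE_ISOMERS).items()):
        print(f"Wiener index {value}: {count} structure(s)")
```

In this small set the nine isomers yield only seven distinct Wiener values: 2,2-dimethylpentane and 2,3-dimethylpentane both give 46, and 2,4-dimethylpentane and 3-ethylpentane both give 48. A descriptor value with degeneracy greater than one cannot identify a single structure on its own, which is the property the safety question turns on.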

Acknowledgements

This work was funded in part by the U.S. Department of Energy’s Genomics: GTL program (www.doegenomestolife.org) under the project “Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling” (www.genomes-to-life.org). This work was also funded by the Sandia National Laboratories Computer Science Research Fund. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

Author information

Corresponding author

Correspondence to Jean-Loup Faulon.

About this article

Cite this article

Faulon, JL., Brown, W.M. & Martin, S. Reverse engineering chemical structures from molecular descriptors: how many solutions? J Comput Aided Mol Des 19, 637–650 (2005). https://doi.org/10.1007/s10822-005-9007-1

  • DOI: https://doi.org/10.1007/s10822-005-9007-1
