Summary
Physical, chemical and biological properties are the ultimate information of interest for chemical compounds. Molecular descriptors that map structural information to activities and properties are obvious candidates for information sharing. In this paper, we consider the feasibility of using molecular descriptors to safely exchange chemical information in such a way that the original chemical structures cannot be reverse engineered. To investigate the safety of sharing such descriptors, we compute the degeneracy (the number of structures matching a descriptor value) of several 2D descriptors, and use various methods to search for and reverse engineer structures. We examine degeneracy in the entire chemical space, taking descriptor values from the alkane isomer series and the PubChem database. We further use a stochastic search to retrieve structures matching specific topological index values. Finally, we investigate the safety of exchanging fragmental descriptors using deterministic enumeration.
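The notion of degeneracy used above can be illustrated with a small sketch. It assumes the Wiener index (the sum of shortest-path distances between all pairs of heavy atoms) as the descriptor and the nine heptane isomers as the structure set; both choices, the hand-written bond lists, and the names wiener_index and HEPTANES are illustrative assumptions and are not taken from the paper's own method or data.

```python
from collections import Counter, deque


def wiener_index(n_atoms, bonds):
    """Sum of shortest-path lengths over all atom pairs of a
    hydrogen-suppressed molecular graph (illustrative helper)."""
    adj = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    total = 0
    for src in range(n_atoms):
        # Breadth-first search gives topological distances from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted from both ends


# Carbon skeletons of the nine C7H16 isomers (assumed, hand-written input).
HEPTANES = {
    "n-heptane": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
    "2-methylhexane": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (1, 6)],
    "3-methylhexane": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (2, 6)],
    "2,2-dimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (1, 6)],
    "2,3-dimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (2, 6)],
    "2,4-dimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (3, 6)],
    "3,3-dimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (2, 6)],
    "3-ethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (5, 6)],
    "2,2,3-trimethylbutane": [(0, 1), (1, 2), (2, 3), (1, 4), (1, 5), (2, 6)],
}

# Descriptor value for each structure.
wiener = {name: wiener_index(7, bonds) for name, bonds in HEPTANES.items()}

# Degeneracy: number of structures that share each descriptor value.
degeneracy = Counter(wiener.values())

if __name__ == "__main__":
    for value, count in sorted(degeneracy.items()):
        names = [n for n, w in wiener.items() if w == value]
        print(value, count, names)
```

In this toy set the nine isomers give seven distinct index values, with the values 46 and 48 each shared by two isomers. A receiver who knows only such a value cannot tell the matching structures apart, and the degeneracy counts measure exactly this ambiguity.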
Acknowledgements
This work was funded in part by the U.S. Department of Energy’s Genomics: GTL program (www.doegenomestolife.org) under the project “Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling” (www.genomes-to-life.org). This work was also funded by the Sandia National Laboratories Computer Science Research Fund. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
Cite this article
Faulon, JL., Brown, W.M. & Martin, S. Reverse engineering chemical structures from molecular descriptors: how many solutions? J Comput Aided Mol Des 19, 637–650 (2005). https://doi.org/10.1007/s10822-005-9007-1
DOI: https://doi.org/10.1007/s10822-005-9007-1