Reverse engineering chemical structures from molecular descriptors: how many solutions?
Physical, chemical and biological properties are the ultimate information of interest for chemical compounds. Molecular descriptors that map structural information to activities and properties are obvious candidates for information sharing. In this paper, we consider the feasibility of using molecular descriptors to safely exchange chemical information in such a way that the original chemical structures cannot be reverse engineered. To investigate the safety of sharing such descriptors, we compute the degeneracy (the number of structures matching a descriptor value) of several 2D descriptors, and use various methods to search for and reverse engineer structures. We examine degeneracy in the entire chemical space, taking descriptor values from the alkane isomer series and the PubChem database. We further use a stochastic search to retrieve structures matching specific topological index values. Finally, we investigate the safety of exchanging fragmental descriptors using deterministic enumeration.
Keywords: enumeration, molecular fragments, molecular design, structure–properties relationships, topological indices
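To make the notion of degeneracy concrete, the following minimal sketch counts how many structures share each value of one topological index over a small alkane isomer series. It is an illustration under stated assumptions, not the procedure used in the paper: the Wiener index is chosen here only as a familiar example of a 2D descriptor, the nine heptane carbon skeletons are written out by hand rather than enumerated, and the names `wiener_index`, `HEPTANE_ISOMERS`, `index_by_name` and `degeneracy` are hypothetical.

```python
from collections import Counter, deque


def wiener_index(n_atoms, bonds):
    """Sum of shortest-path bond counts over all unordered atom pairs."""
    adjacency = {atom: [] for atom in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    total = 0
    for source in range(n_atoms):
        # Breadth-first search gives topological distances from `source`.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
        total += sum(distance.values())
    return total // 2  # every pair was counted once from each end


# Assumption: hydrogen-suppressed carbon skeletons of the nine C7H16 isomers,
# written by hand as bond lists over atoms 0..6.
HEPTANE_ISOMERS = {
    "n-heptane": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
    "2-methylhexane": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (1, 6)],
    "3-methylhexane": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (2, 6)],
    "2,2-dimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (1, 6)],
    "2,3-dimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (2, 6)],
    "2,4-dimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (3, 6)],
    "3,3-dimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (2, 6)],
    "3-ethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (5, 6)],
    "2,2,3-trimethylbutane": [(0, 1), (1, 2), (2, 3), (1, 4), (1, 5), (2, 6)],
}

# Descriptor value for each structure, then the number of structures per value.
index_by_name = {
    name: wiener_index(7, bonds) for name, bonds in HEPTANE_ISOMERS.items()
}
degeneracy = Counter(index_by_name.values())

for value, count in sorted(degeneracy.items()):
    print(f"Wiener index {value}: {count} structure(s)")
```

In this small series, nine structures map to seven distinct index values: 2,2-dimethylpentane and 2,3-dimethylpentane share one value, and 2,4-dimethylpentane and 3-ethylpentane share another. A descriptor value with degeneracy greater than one cannot, on its own, identify a unique structure.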
This work was funded in part by the U.S. Department of Energy’s Genomics: GTL program (www.doegenomestolife.org) under the project “Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling” (www.genomes-to-life.org). This work was also funded by the Sandia National Laboratories Computer Science Research Fund. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.