Abstract
Accurately predicting the endpoints of chemical compounds is an important step in drug design, and in molecular screening in particular.
Here we develop a recursive architecture that maps undirected graphs to individual labels, and apply it to predicting several different properties of small molecules. The results we obtain are generally state of the art.
The final model is completely general and may be applied not only to the prediction of molecular properties, but also to a vast range of problems in which the input is a graph and the output is either a single property or (with small modifications) a set of properties of the nodes.
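To make the graph-to-label setting concrete, the following is a minimal sketch of one generic way to map an undirected graph to a single scalar. It is not the architecture developed in the paper, which the abstract does not specify. The neighbour-sum update, the fixed number of rounds, the untrained random weights, and all names (`predict`, `w_in`, `w_nb`, `w_out`, the toy three-atom molecule) are assumptions made purely for illustration.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix; stands in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def predict(node_features, edges, params, rounds=2):
    """Map an undirected graph to one scalar label (illustrative only)."""
    n = len(node_features)
    hidden = len(params["w_in"])

    # Undirected graph: store each edge in both directions.
    neighbours = [[] for _ in range(n)]
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    states = [[0.0] * hidden for _ in range(n)]
    for _ in range(rounds):
        new_states = []
        for i in range(n):
            # Summing neighbour states makes the update independent of
            # the order in which neighbours are listed.
            agg = [sum(states[j][k] for j in neighbours[i]) for k in range(hidden)]
            pre = [
                u + v
                for u, v in zip(
                    matvec(params["w_in"], node_features[i]),
                    matvec(params["w_nb"], agg),
                )
            ]
            new_states.append([math.tanh(x) for x in pre])
        states = new_states

    # Sum over nodes gives a graph-level vector; a linear read-out
    # turns it into a single property value.
    graph_vec = [sum(s[k] for s in states) for k in range(hidden)]
    return sum(w * x for w, x in zip(params["w_out"], graph_vec))


rng = random.Random(0)
HIDDEN, FEATURES = 4, 2  # features: one-hot atom type [carbon, oxygen]
params = {
    "w_in": make_matrix(HIDDEN, FEATURES, rng),
    "w_nb": make_matrix(HIDDEN, HIDDEN, rng),
    "w_out": [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)],
}

# Toy molecule: heavy atoms of a C-C-O chain.
y = predict([[1, 0], [1, 0], [0, 1]], [(0, 1), (1, 2)], params)

# Same molecule with its atoms numbered in the opposite order.
y_perm = predict([[0, 1], [1, 0], [1, 0]], [(2, 1), (1, 0)], params)
```

Because both aggregation steps are sums, `y` and `y_perm` agree up to floating-point rounding: the output depends on the graph, not on how its nodes happen to be numbered. Keeping the per-node states instead of summing them would give one output per node, which corresponds to the node-property variant mentioned in the abstract.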
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Walsh, I., Vullo, A., Pollastri, G. (2009). Recursive Neural Networks for Undirected Graphs for Learning Molecular Endpoints. In: Kadirkamanathan, V., Sanguinetti, G., Girolami, M., Niranjan, M., Noirel, J. (eds) Pattern Recognition in Bioinformatics. PRIB 2009. Lecture Notes in Computer Science, vol 5780. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04031-3_34
DOI: https://doi.org/10.1007/978-3-642-04031-3_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04030-6
Online ISBN: 978-3-642-04031-3
eBook Packages: Computer Science, Computer Science (R0)