Abstract
Quantitative structure-activity relationships (QSARs) are regression models relating chemical structure to biological activity. Such models make it possible to predict toxicologically or pharmacologically relevant endpoints, which constitute the target outcomes of trials or experiments. The task is often tackled by instance-based methods (like k-nearest neighbors), all of which rest on a notion of chemical (dis-)similarity. Our starting point is the observation by Raymond and Willett that the two major families of chemical distance measures, fingerprint-based and maximum common subgraph-based measures, provide orthogonal information about chemical similarity. The paper presents a novel method for finding suitable combinations of them, called adapted transfer, which adapts a distance measure learned on another, related dataset to a given dataset. Adapted transfer thus combines distance learning and transfer learning in a novel manner. In a set of experiments, we compare adapted transfer with distance learning on the target dataset itself and with inductive transfer without adaptation. We visualize the performance of the methods by learning curves (i.e., performance as a function of training set size) and present a quantitative comparison for 10% and 100% of the maximum training set size.
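The setting described above can be made concrete with a small sketch. It is a toy illustration under stated assumptions, not the algorithm of the paper: the convex-combination weight `alpha`, the grid search, the leave-one-out criterion, the `radius` used for adaptation, and all function names are hypothetical choices made only to show how two distance measures might be combined for k-nearest-neighbor regression, learned on a source dataset, and then adjusted on a target dataset.

```python
import numpy as np

def combine(d_fp, d_mcs, alpha):
    """Convex combination of two precomputed distance matrices (assumed form)."""
    return alpha * d_fp + (1.0 - alpha) * d_mcs

def knn_predict(dist_to_train, y_train, k=3):
    """Predict each query's activity as the mean over its k nearest training compounds."""
    nearest = np.argsort(dist_to_train, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def loo_error(d, y, k=3):
    """Leave-one-out mean squared error of k-NN on a square distance matrix."""
    d = np.array(d, dtype=float)   # work on a float copy
    np.fill_diagonal(d, np.inf)    # a compound may not be its own neighbor
    return float(np.mean((knn_predict(d, y, k) - y) ** 2))

def best_alpha(d_fp, d_mcs, y, grid, k=3):
    """Grid-search the weight that minimizes leave-one-out error (distance learning)."""
    errors = [loo_error(combine(d_fp, d_mcs, a), y, k) for a in grid]
    return float(grid[int(np.argmin(errors))])

def adapt_alpha(alpha_source, d_fp, d_mcs, y, radius=0.2, steps=5, k=3):
    """Hypothetical adaptation step: search only near the weight learned on the source data."""
    grid = np.clip(
        np.linspace(alpha_source - radius, alpha_source + radius, steps), 0.0, 1.0
    )
    return best_alpha(d_fp, d_mcs, y, grid, k)
```

In this sketch, `best_alpha` applied to the target data alone corresponds to distance learning on the target dataset, reusing the source weight unchanged corresponds to transfer without adaptation, and `adapt_alpha` stands in for an adapted transfer. Restricting the search to a neighborhood of the source weight is one simple way to keep the target estimate stable when few target compounds are available.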
References
Horváth, T., Gärtner, T., Wrobel, S.: Cyclic pattern kernels for predictive graph mining. In: Proc. of KDD 2004, pp. 158–167. ACM Press, New York (2004)
Shervashidze, N., Vishwanathan, S., Petri, T., Mehlhorn, K., Borgwardt, K.: Efficient Graphlet Kernels for Large Graph Comparison. In: Proc. of AISTATS 2009 (2009)
Goldberger, J., Roweis, S., Hinton, G., Salakhutdinov, R.: Neighborhood Component Analysis. In: Proc. of NIPS 2004, pp. 513–520 (2005)
Eaton, E., Desjardins, M., Lane, T.: Modeling transfer relationships between learning tasks for improved inductive transfer. In: Proc. of ECML PKDD 2008, pp. 317–332. Springer, Heidelberg (2008)
Raymond, J.W., Willett, P.: Effectiveness of graph-based and fingerprint-based similarity measures for virtual screening of 2D chemical structure databases. JCAMD, 59–71 (January 2002)
Sutherland, J.J., O’Brien, L.A., Weaver, D.F.: Spline-fitting with a genetic algorithm: A method for developing classification structure-activity relationships. J. Chem. Inf. Model. 43(6), 1906–1915 (2003)
Sutherland, J.J., O’Brien, L.A., Weaver, D.F.: A comparison of methods for modeling quantitative structure-activity relationships. J. Med. Chem. 47(22), 5541–5554 (2004)
Benigni, R., Bossa, C., Vari, M.R.: Chemical carcinogens: Structures and experimental data, http://www.iss.it/binary/ampp/cont/ISSCANv2aEn.1134647480.pdf
Rückert, U., Kramer, S.: Frequent free tree discovery in graph data. In: SAC 2004, pp. 564–570. ACM Press, New York (2004)
Woznica, A., Kalousis, A., Hilario, M.: Learning to combine distances for complex representations. In: Proc. of ICML 2007, pp. 1031–1038. ACM Press, New York (2007)
Hillel, A.B., Weinshall, D.: Learning distance function by coding similarity. In: Proc. of ICML 2007, pp. 65–72. ACM Press, New York (2007)
Weinberger, K.Q., Tesauro, G.: Metric learning for kernel regression. In: Proc. of AISTATS 2007 (2007)
Baxter, J.: Learning Internal Representations. In: Proc. COLT 1995, pp. 311–320. ACM Press, New York (1995)
Evgeniou, T., Micchelli, C.A., Pontil, M.: Learning multiple tasks with kernel methods. J. Mach. Learn. Res. 6, 615–637 (2005)
Sonnenburg, S., Rätsch, G., Schäfer, C., Schölkopf, B.: Large scale multiple kernel learning. J. Mach. Learn. Res. 7, 1531–1565 (2006)
Neuhaus, M., Bunke, H.: Bridging the Gap Between Graph Edit Distance and Kernel Machines. World Scientific Publishing Co., Inc, Singapore (2007)
Zha, Z.J., Mei, T., Wang, M., Wang, Z., Hua, X.S.: Robust distance metric learning with auxiliary knowledge. In: Proc. of IJCAI 2009, pp. 1327–1332 (2009)
Erhan, D., Bengio, Y., L’Heureux, P.J., Yue, S.Y.: Generalizing to a zero-data task: a computational chemistry case study. Technical Report 1286, Département d’informatique et recherche opérationnelle, University of Montreal (2006)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rückert, U., Girschick, T., Buchwald, F., Kramer, S. (2010). Adapted Transfer of Distance Measures for Quantitative Structure-Activity Relationships. In: Pfahringer, B., Holmes, G., Hoffmann, A. (eds) Discovery Science. DS 2010. Lecture Notes in Computer Science, vol 6332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16184-1_24
DOI: https://doi.org/10.1007/978-3-642-16184-1_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16183-4
Online ISBN: 978-3-642-16184-1
eBook Packages: Computer Science (R0)