Adapted Transfer of Distance Measures for Quantitative Structure-Activity Relationships

  • Ulrich Rückert
  • Tobias Girschick
  • Fabian Buchwald
  • Stefan Kramer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6332)

Abstract

Quantitative structure-activity relationships (QSARs) are regression models relating chemical structure to biological activity. Such models make it possible to predict toxicologically or pharmacologically relevant endpoints, which constitute the target outcomes of trials or experiments. The task is often tackled by instance-based methods (such as k-nearest neighbors), which are all based on the notion of chemical (dis-)similarity. Our starting point is the observation by Raymond and Willett that the two big families of chemical distance measures, fingerprint-based and maximum common subgraph-based measures, provide orthogonal information about chemical similarity. The paper presents a novel method for finding suitable combinations of them, called adapted transfer, which adapts a distance measure learned on another, related dataset to a given dataset. Adapted transfer thus combines distance learning and transfer learning in a novel manner. In a set of experiments, we compare adapted transfer with distance learning on the target dataset itself and inductive transfer without adaptations. We visualize the performance of the methods by learning curves (i.e., as a function of training set size) and present a quantitative comparison for 10% and 100% of the maximum training set size.
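The abstract does not state how the two distance families are combined or how the transferred measure is adapted, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's algorithm. It assumes that the fingerprint-based and maximum common subgraph-based distances are already available as precomputed matrices, that the combination is a single convex weight `w`, that the weight is scored by the leave-one-out error of a k-nearest-neighbor regressor, and that adaptation means searching only near the weight learned on a related source dataset. All function names and parameters (`combine`, `loo_mse`, `learn_weight`, `adapt_weight`, `radius`, `steps`) are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def combine(d_fp: Matrix, d_mcs: Matrix, w: float) -> Matrix:
    """Convex combination of two precomputed distance matrices (assumed form)."""
    n = len(d_fp)
    return [[w * d_fp[i][j] + (1.0 - w) * d_mcs[i][j] for j in range(n)]
            for i in range(n)]


def knn_predict(dist_row: Sequence[float], y: Sequence[float],
                k: int, skip: int) -> float:
    """Mean activity of the k nearest neighbors, excluding the compound itself."""
    order = sorted((j for j in range(len(y)) if j != skip),
                   key=lambda j: dist_row[j])
    nearest = order[:k]
    return sum(y[j] for j in nearest) / len(nearest)


def loo_mse(dist: Matrix, y: Sequence[float], k: int) -> float:
    """Leave-one-out mean squared error of k-NN regression under `dist`."""
    n = len(y)
    return sum((knn_predict(dist[i], y, k, i) - y[i]) ** 2
               for i in range(n)) / n


def learn_weight(d_fp: Matrix, d_mcs: Matrix, y: Sequence[float],
                 k: int, candidates: Sequence[float]) -> float:
    """Distance learning on one dataset: pick the candidate weight with lowest error."""
    return min(candidates,
               key=lambda w: loo_mse(combine(d_fp, d_mcs, w), y, k))


def adapt_weight(w_source: float, d_fp: Matrix, d_mcs: Matrix,
                 y: Sequence[float], k: int,
                 radius: float = 0.2, steps: int = 4) -> float:
    """Adaptation (assumed form): re-tune the transferred weight on the target
    dataset, searching only within `radius` of the source weight."""
    candidates = [min(1.0, max(0.0, w_source + radius * s / steps))
                  for s in range(-steps, steps + 1)]
    return learn_weight(d_fp, d_mcs, y, k, candidates)
```

In this sketch, calling `learn_weight` on the target data alone corresponds to distance learning on the target dataset, using the source weight unchanged corresponds to transfer without adaptation, and `adapt_weight` sits between the two: restricting the search to a neighborhood of the source weight is one plausible way to limit overfitting when the target training set is small.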

References

  1. Horváth, T., Gärtner, T., Wrobel, S.: Cyclic pattern kernels for predictive graph mining. In: Proc. of KDD 2004, pp. 158–167. ACM Press, New York (2004)
  2. Shervashidze, N., Vishwanathan, S., Petri, T., Mehlhorn, K., Borgwardt, K.: Efficient Graphlet Kernels for Large Graph Comparison. In: Proc. of AISTATS 2009 (2009)
  3. Goldberger, J., Roweis, S., Hinton, G., Salakhutdinov, R.: Neighborhood Component Analysis. In: Proc. of NIPS 2004, pp. 513–520 (2005)
  4. Eaton, E., Desjardins, M., Lane, T.: Modeling transfer relationships between learning tasks for improved inductive transfer. In: Proc. of ECML PKDD 2008, pp. 317–332. Springer, Heidelberg (2008)
  5. Raymond, J.W., Willett, P.: Effectiveness of graph-based and fingerprint-based similarity measures for virtual screening of 2D chemical structure databases. JCAMD, 59–71 (January 2002)
  6. Sutherland, J.J., O’Brien, L.A., Weaver, D.F.: Spline-fitting with a genetic algorithm: A method for developing classification structure-activity relationships. J. Chem. Inf. Model 43(6), 1906–1915 (2003)
  7. Sutherland, J.J., O’Brien, L.A., Weaver, D.F.: A comparison of methods for modeling quantitative structure-activity relationships. J. Med. Chem. 47(22), 5541–5554 (2004)
  8. Benigni, R., Bossa, C., Vari, M.R.: Chemical carcinogens: Structures and experimental data, http://www.iss.it/binary/ampp/cont/ISSCANv2aEn.1134647480.pdf
  9. Rückert, U., Kramer, S.: Frequent free tree discovery in graph data. In: SAC 2004, pp. 564–570. ACM Press, New York (2004)
  10. Woznica, A., Kalousis, A., Hilario, M.: Learning to combine distances for complex representations. In: Proc. of ICML 2007, pp. 1031–1038. ACM Press, New York (2007)
  11. Hillel, A.B., Weinshall, D.: Learning distance function by coding similarity. In: Proc. of ICML 2007, pp. 65–72. ACM Press, New York (2007)
  12. Weinberger, K.Q., Tesauro, G.: Metric learning for kernel regression. In: Proc. of AISTATS 2007 (2007)
  13. Baxter, J.: Learning Internal Representations. In: Proc. COLT 1995, pp. 311–320. ACM Press, New York (1995)
  14. Evgeniou, T., Micchelli, C.A., Pontil, M.: Learning multiple tasks with kernel methods. J. Mach. Learn. Res. 6, 615–637 (2005)
  15. Sonnenburg, S., Rätsch, G., Schäfer, C., Schölkopf, B.: Large scale multiple kernel learning. J. Mach. Learn. Res. 7, 1531–1565 (2006)
  16. Neuhaus, M., Bunke, H.: Bridging the Gap Between Graph Edit Distance and Kernel Machines. World Scientific Publishing Co., Inc, Singapore (2007)
  17. Zha, Z.J., Mei, T., Wang, M., Wang, Z., Hua, X.S.: Robust distance metric learning with auxiliary knowledge. In: Proc. of IJCAI 2009, pp. 1327–1332 (2009)
  18. Erhan, D., Bengio, Y., L’Heureux, P.J., Yue, S.Y.: Generalizing to a zero-data task: a computational chemistry case study. Technical Report 1286, Département d’informatique et recherche opérationnelle, University of Montreal (2006)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Ulrich Rückert (1)
  • Tobias Girschick (2)
  • Fabian Buchwald (2)
  • Stefan Kramer (2)

  1. International Computer Science Institute, Berkeley, USA
  2. Institut für Informatik/I12, Technische Universität München, Garching b. München, Germany
