Adapted Transfer of Distance Measures for Quantitative Structure-Activity Relationships

Abstract

Quantitative structure-activity relationships (QSARs) are regression models relating chemical structure to biological activity. Such models make it possible to predict toxicologically or pharmacologically relevant endpoints, which constitute the target outcomes of trials or experiments. The task is often tackled by instance-based methods (like k-nearest neighbors), which all rely on a notion of chemical (dis-)similarity. Our starting point is the observation by Raymond and Willett that the two big families of chemical distance measures, fingerprint-based and maximum common subgraph-based measures, provide orthogonal information about chemical similarity. The paper presents a novel method for finding suitable combinations of them, called adapted transfer, which adapts a distance measure learned on another, related dataset to a given dataset. Adapted transfer thus combines distance learning and transfer learning in a novel manner. In a set of experiments, we compare adapted transfer with distance learning on the target dataset itself and with inductive transfer without adaptations. We visualize the performance of the methods by learning curves (i.e., performance as a function of training set size) and present a quantitative comparison for 10% and 100% of the maximum training set size.
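
The abstract does not say how the two distance families are combined or how the transferred measure is adapted, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes a convex combination of two precomputed distance matrices with a single weight w, a k-nearest-neighbor regressor scored by leave-one-out squared error, and an "adaptation" step that re-tunes w on the target data within a small neighborhood of the weight learned on the source data. All function names and the parameters `radius` and `steps` are hypothetical.

```python
def combine(d_fp, d_mcs, w):
    """Convex combination of a fingerprint-based and an MCS-based distance.

    ASSUMPTION: a single weight w in [0, 1]; the paper may use another form.
    """
    return w * d_fp + (1.0 - w) * d_mcs


def knn_predict(query_dists, activities, k=3):
    """Mean activity of the k training compounds closest to the query."""
    nearest = sorted(range(len(activities)), key=lambda i: query_dists[i])[:k]
    return sum(activities[i] for i in nearest) / len(nearest)


def loo_error(D_fp, D_mcs, y, w, k=3):
    """Leave-one-out mean squared error of kNN under the combined distance."""
    n = len(y)
    total = 0.0
    for q in range(n):
        others = [i for i in range(n) if i != q]
        dists = [combine(D_fp[q][i], D_mcs[q][i], w) for i in others]
        pred = knn_predict(dists, [y[i] for i in others], k)
        total += (pred - y[q]) ** 2
    return total / n


def learn_weight(D_fp, D_mcs, y, candidates, k=3):
    """Distance learning by search: pick the candidate weight with lowest error."""
    return min(candidates, key=lambda w: loo_error(D_fp, D_mcs, y, w, k))


def adapted_transfer(source, target, k=3, radius=0.2, steps=5):
    """Learn w on the source dataset, then re-tune it locally on the target.

    source and target are (D_fp, D_mcs, y) triples. ASSUMPTION: adaptation is
    a search restricted to [w_src - radius, w_src + radius], clipped to [0, 1].
    """
    grid = [i / 10 for i in range(11)]
    w_src = learn_weight(*source, grid, k)            # inductive transfer alone
    lo, hi = max(0.0, w_src - radius), min(1.0, w_src + radius)
    local = [lo + (hi - lo) * j / (steps - 1) for j in range(steps)]
    w_tgt = learn_weight(*target, local, k)           # adaptation on the target
    return w_src, w_tgt
```

Under these assumptions, the three settings compared in the abstract map onto the sketch as follows: distance learning on the target alone corresponds to calling `learn_weight` on the target triple with the full grid, inductive transfer without adaptation uses `w_src` unchanged, and adapted transfer uses `w_tgt`.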