1 Introduction

The principal goal of the drug design process is to find new chemical compounds that are able to modulate the activity of a given target in a desired way [13]. However, finding such molecules in the high-dimensional chemical space of all molecules without any prior knowledge is nearly impossible. In silico methods have been introduced to leverage the existing knowledge, thus forming a new branch of science – computer-aided drug design (CADD) [1, 12].

The recent advancements in deep learning have encouraged its application in CADD [4]. One of the main approaches is de novo design, that is, using generative models to propose new molecules that are likely to possess the desired properties [3, 5, 15, 17].

At the center of our interest are the hit-to-lead and lead optimization phases of the compound design process. Their goal is to optimize the drug candidates identified in the previous steps in terms of the desired activity profile and their physicochemical and pharmacokinetic properties.

To address this problem, we introduce Mol-CycleGAN – a generative model based on CycleGAN [19]. Given a starting molecule, it generates a structurally similar one with a desired characteristic. We show that our model generates molecules that possess desired properties while retaining their structural similarity to the starting compound. Moreover, thanks to employing a graph-based representation, our algorithm always returns valid compounds.

To assess the model’s utility for compound design, we evaluate its ability to maximize the penalized logP property. We choose penalized logP because it is a common testing ground for molecule optimization models [7, 18], owing to its relevance in the drug design process. In the optimization of penalized logP for drug-like molecules our model significantly outperforms previous results. To the best of our knowledge, Mol-CycleGAN is the first approach to molecule generation that uses the CycleGAN architecture.

2 Mol-CycleGAN

Mol-CycleGAN is a novel method of performing compound optimization by learning from the sets of molecules without and with the desired molecular property (denoted by the sets X and Y, respectively). Our approach is to train a model to perform the transformation \(G: X \rightarrow Y\) (and \(F: Y \rightarrow X\)), which returns the optimized molecules. In the context of compound design, X (Y) can be, e.g., the set of inactive (active) molecules.

To represent the sets X and Y our approach requires an embedding of molecules which is reversible, i.e. enables both encoding and decoding of molecules. For this purpose we use the latent space of Junction Tree Variational Autoencoder (JT-VAE) [7] – we represent each molecule as a point in the latent space, given by the mean of the variational encoding distribution [9]. This approach has the advantage that the distance between molecules (required to calculate the loss function) can be defined directly in the latent space.

Our model works as follows: (i) we define the sets X and Y (e.g., inactive/active molecules); (ii) we introduce the mapping functions \(G: X \rightarrow Y\) and \(F: Y \rightarrow X\); (iii) we introduce the discriminator \(D_X\) (and \(D_Y\)), which forces the generator F (and G) to generate samples from a distribution close to the distribution of X (or Y). The components F, G, \(D_X\), and \(D_Y\) are modeled by neural networks (see Sect. 2.1 for technical details).

The main idea is to: (i) take the prior molecule x without a specified feature (e.g. activity) from set X and compute its latent space embedding; (ii) use the generative neural network G to obtain the embedding of molecule G(x), which has this feature (as if the G(x) molecule came from set Y) but is also similar to the original molecule x; (iii) decode the latent space coordinates given by G(x) to obtain the optimized molecule. This makes the method applicable in lead optimization processes, as the generated compound G(x) remains structurally similar to the input molecule.
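This procedure can be sketched in a few lines of Python; here `jtvae` (exposing `encode_mean` and `decode`) and `generator_g` are hypothetical stand-ins for the JT-VAE model and the trained generator G, not names from our implementation:

```python
def optimize_molecule(smiles, jtvae, generator_g):
    """Steps (i)-(iii): encode, translate in latent space, decode.

    `jtvae` and `generator_g` are hypothetical stand-ins for the
    JT-VAE model and the trained generator G.
    """
    # (i) embed the starting molecule x as the mean of its
    #     variational encoding distribution in the JT-VAE latent space
    z_x = jtvae.encode_mean(smiles)

    # (ii) translate the embedding towards the region of molecules
    #      with the desired feature, staying close to z_x
    z_gx = generator_g(z_x)

    # (iii) decode the latent coordinates back into a molecule
    return jtvae.decode(z_gx)
```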

To train the Mol-CycleGAN we use the following loss function:

$$\begin{aligned} \begin{aligned} L(G,F,D_X,D_Y)&= L_{\mathrm{GAN}}(G,D_Y,X,Y) + L_{\mathrm{GAN}}(F,D_X,Y,X)\\&+ \lambda _1 L_{\mathrm{cyc}}(G,F) + \lambda _2 L_{\mathrm{identity}}(G,F), \end{aligned} \end{aligned}$$
(1)

and aim to solve

$$\begin{aligned} G^*, F^* = \arg \min _{G, F} \max _{D_X, D_Y} L(G, F, D_X, D_Y). \end{aligned}$$
(2)

We use the adversarial loss introduced in LS-GAN [11]:

$$\begin{aligned} L_{\mathrm{GAN}}(G,D_Y,X,Y) = \frac{1}{2} \ \mathbb {E}_{y \sim p_{\mathrm{data}}(y)}[(D_Y(y) - 1)^2] + \frac{1}{2} \ \mathbb {E}_{x \sim p_{\mathrm{data}}(x)}[(D_Y(G(x)))^2], \end{aligned}$$
(3)

which ensures that the generator G (and F) generates samples from a distribution close to the distribution of Y (or X).

The cycle consistency loss:

$$\begin{aligned} L_{\mathrm{cyc}}(G,F) = \mathbb {E}_{y \sim p_{\mathrm{data}}(y)}[\Vert G(F(y)) - y \Vert _1] + \mathbb {E}_{x \sim p_{\mathrm{data}}(x)}[\Vert F(G(x)) - x \Vert _1], \end{aligned}$$
(4)

reduces the space of possible mapping functions, such that for a molecule x from set X, the GAN cycle brings it back to a molecule similar to x, i.e. F(G(x)) is close to x (and analogously G(F(y)) is close to y).

Finally, to ensure that the generated (optimized) molecule is close to the starting one, we use the identity mapping loss [19]:

$$\begin{aligned} L_{\mathrm{identity}}(G,F) = \mathbb {E}_{y \sim p_{\mathrm{data}}(y)}[\Vert F(y) - y \Vert _1] + \mathbb {E}_{x \sim p_{\mathrm{data}}(x)}[\Vert G(x) - x \Vert _1], \end{aligned}$$
(5)

which further reduces the space of possible mapping functions and prevents the model from generating molecules that lie far away from the starting molecule in the latent space of JT-VAE.

In our experiments, we use the hyperparameters \(\lambda _1 = 0.3\) and \(\lambda _2 = 0.1\). Note that these parameters control the balance between improvement in the optimized property and similarity between the generated and the starting molecule.
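For concreteness, the full objective of Eqs. (1) and (3)–(5) can be written down directly. Below is a minimal NumPy sketch, assuming minibatches x and y of JT-VAE latent vectors and callables standing in for the four networks; the alternating min-max training of Eq. (2) is not shown:

```python
import numpy as np

LAMBDA_1, LAMBDA_2 = 0.3, 0.1  # hyperparameters used in our experiments

def mol_cyclegan_loss(x, y, G, F, D_X, D_Y):
    """Total objective of Eq. (1) for minibatches x, y of latent vectors."""
    # adversarial losses, Eq. (3) (least-squares GAN)
    l_gan_g = 0.5 * np.mean((D_Y(y) - 1.0) ** 2) + 0.5 * np.mean(D_Y(G(x)) ** 2)
    l_gan_f = 0.5 * np.mean((D_X(x) - 1.0) ** 2) + 0.5 * np.mean(D_X(F(y)) ** 2)

    # cycle consistency, Eq. (4): L1 norm per sample, averaged over the batch
    l_cyc = (np.mean(np.sum(np.abs(G(F(y)) - y), axis=1))
             + np.mean(np.sum(np.abs(F(G(x)) - x), axis=1)))

    # identity mapping, Eq. (5): keep each output near its input
    l_id = (np.mean(np.sum(np.abs(F(y) - y), axis=1))
            + np.mean(np.sum(np.abs(G(x) - x), axis=1)))

    return l_gan_g + l_gan_f + LAMBDA_1 * l_cyc + LAMBDA_2 * l_id
```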

2.1 Workflow

We conduct experiments to test whether the proposed model is able to generate molecules that are close to the starting ones and possess an increased octanol-water partition coefficient (logP) penalized by the synthetic accessibility (SA) score. We optimize penalized logP while constraining the degree of deviation from the starting molecule. The similarity between molecules is measured with Tanimoto similarity on Morgan fingerprints [14].
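Both quantities can be computed with RDKit, as in the sketch below; the `sascorer` module ships with RDKit's contrib code, and some definitions of penalized logP additionally subtract a penalty for rings longer than six atoms, which we omit here for brevity:

```python
import os
import sys

from rdkit import Chem, DataStructs, RDConfig
from rdkit.Chem import AllChem, Crippen

# sascorer lives in RDKit's contrib directory; the path may differ per install
sys.path.append(os.path.join(RDConfig.RDContribDir, 'SA_Score'))
import sascorer

def tanimoto_similarity(smiles_a, smiles_b, radius=2):
    """Tanimoto similarity on Morgan fingerprints (the constraint metric)."""
    fp_a = AllChem.GetMorganFingerprint(Chem.MolFromSmiles(smiles_a), radius)
    fp_b = AllChem.GetMorganFingerprint(Chem.MolFromSmiles(smiles_b), radius)
    return DataStructs.TanimotoSimilarity(fp_a, fp_b)

def penalized_logp(smiles):
    """logP penalized by the SA score (long-cycle penalty omitted)."""
    mol = Chem.MolFromSmiles(smiles)
    return Crippen.MolLogP(mol) - sascorer.calculateScore(mol)
```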

We use the ZINC-250K dataset used in similar studies [7, 10], which contains 250,000 drug-like molecules extracted from the ZINC database [16]. The sets \(X_{\text {train}}\) and \(Y_{\text {train}}\) are random samples of size 80,000 from ZINC-250K, where the compounds’ penalized logP values are below and above the median, respectively. \(X_{\text {test}}\) is a separate, non-overlapping set consisting of the 800 molecules with the lowest values of penalized logP in ZINC-250K.
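A sketch of this split, assuming a hypothetical `zinc250k.csv` file with `smiles` and `plogp` (penalized logP) columns:

```python
import pandas as pd

# hypothetical CSV with precomputed penalized logP for each molecule
zinc = pd.read_csv('zinc250k.csv')

x_test = zinc.nsmallest(800, 'plogp')  # the 800 lowest penalized logP values
pool = zinc.drop(x_test.index)         # keep the test set non-overlapping

median = zinc['plogp'].median()
x_train = pool[pool['plogp'] < median].sample(n=80_000, random_state=0)
y_train = pool[pool['plogp'] > median].sample(n=80_000, random_state=0)
```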

All networks are trained using the Adam optimizer [8] with learning rate 0.0001, batch normalization [6], and leaky ReLU with \(\alpha = 0.1\). The models are trained for 300 epochs. Generators are built of four fully connected residual layers, each with 56 units. Discriminators are built of seven dense layers with 48, 36, 28, 18, 12, 7, and 1 units, respectively.
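A minimal PyTorch sketch of these architectures is given below; the exact placement of batch normalization and activations, and the assumption that the latent (and hence generator layer) width is 56, are our reading of the setup rather than a verbatim reproduction of the implementation:

```python
import torch
import torch.nn as nn

LATENT_DIM = 56  # assumed to match the 56-unit residual layers

class ResidualBlock(nn.Module):
    """Fully connected residual layer with batch norm and leaky ReLU."""
    def __init__(self, dim=LATENT_DIM):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(dim, dim),
            nn.BatchNorm1d(dim),
            nn.LeakyReLU(0.1),
        )

    def forward(self, x):
        return x + self.body(x)

def make_generator():
    # four fully connected residual layers, 56 units each
    return nn.Sequential(*(ResidualBlock() for _ in range(4)))

def make_discriminator():
    # seven dense layers: 48, 36, 28, 18, 12, 7, 1 units
    sizes = [LATENT_DIM, 48, 36, 28, 18, 12, 7, 1]
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        layers.append(nn.Linear(n_in, n_out))
        if n_out != 1:  # linear final output for the least-squares GAN loss
            layers.append(nn.LeakyReLU(0.1))
    return nn.Sequential(*layers)

generator = make_generator()
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)
```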

3 Results

We optimize the penalized logP under the constraint that the similarity between the original and the generated molecule is higher than a fixed threshold (denoted as \(\delta \)). This is a realistic scenario in drug discovery, where the development of new drugs usually starts with known molecules such as existing drugs [2].

We maximize the penalized logP coefficient and use the Tanimoto similarity with the Morgan fingerprint to define the threshold of similarity. We compare our results with previous similar studies [7, 18].

Table 1. Results of the constrained optimization for JT-VAE [7], Graph Convolutional Policy Network (GCPN) [18] and Mol-CycleGAN.
Fig. 1. Molecules with the highest improvement of the penalized logP for \(\delta \ge 0.6\). In the top row we show the starting molecules and in the bottom row the optimized ones. The numbers in the upper row indicate Tanimoto similarities between the starting and the final molecule; the improvement in the score is given below the generated molecules.

In our optimization procedure, each molecule is fed into the generator to obtain the ‘optimized’ molecule G(x). The pair (x, G(x)) defines an ‘optimization path’ in the latent space of JT-VAE. To enable comparison with previous research [7], we start the procedure from the 800 molecules with the lowest values of penalized logP in ZINC-250K and then decode molecules from 80 points along the path from x to G(x), taken in equal steps. From the resulting set of molecules we report the molecule with the highest penalized logP score that satisfies the similarity constraint. In the task of optimizing penalized logP of drug-like molecules, our method significantly outperforms the previous results in the mean improvement of the property (Table 1) and achieves a comparable mean similarity in the constrained scenario (for \(\delta > 0\)).
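A sketch of this decoding procedure, reusing the hypothetical `jtvae` object and the `tanimoto_similarity` and `penalized_logp` helpers from the previous snippets; the linear interpolation in equal steps is our reading of the path construction:

```python
import numpy as np

def best_along_path(x_smiles, jtvae, G, delta, n_points=80):
    """Decode molecules at equal steps from z_x to G(z_x) and keep the one
    with the highest penalized logP satisfying the similarity constraint."""
    z_x = jtvae.encode_mean(x_smiles)
    z_gx = G(z_x)

    best, best_score = None, -np.inf
    for t in np.linspace(0.0, 1.0, n_points):
        # decode an intermediate point on the straight line from z_x to G(z_x)
        candidate = jtvae.decode((1.0 - t) * z_x + t * z_gx)
        if tanimoto_similarity(x_smiles, candidate) >= delta:
            score = penalized_logp(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score
```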

The molecules with the highest improvement of the penalized logP are presented in Fig. 1, with the improvement given below the generated molecules.

Figure 2 shows starting and final molecules, together with all molecules generated along the optimization path and their values of penalized logP.

Fig. 2. Evolution of a selected exemplary molecule during constrained optimization. We only include the steps along the path where a change in the molecule is introduced, and show the values of penalized logP below the molecules.

4 Conclusions

In this work, we introduce Mol-CycleGAN – a new model based on CycleGAN which can be used for the de novo generation of molecules. The advantage of the proposed model is the ability to learn transformation rules from the sets of compounds with desired and undesired values of the considered property. The model can generate molecules with desired properties, as demonstrated for penalized logP. The generated molecules are close to the starting ones, and the degree of similarity can be controlled via a hyperparameter. In the task of constrained optimization of drug-like molecules our model significantly outperforms previous results.

The code used to produce the reported results can be found online at https://github.com/ardigen/mol-cycle-gan.