1 Background

Malaria remains a major cause of death worldwide, especially among children and pregnant women. The infection affected about 219 million people in 2017, 92% of them in the WHO African Region [1]. It caused about 435,000 deaths worldwide in 2017, of which 61% (266,000) were children below the age of 5 years [1]. Five species of Plasmodium parasites cause malaria in humans: P. falciparum, P. vivax, P. malariae, P. ovale, and P. knowlesi [2]. Plasmodium falciparum is the most prevalent and the deadliest of the five species in the WHO regions of Africa, South-East Asia, Eastern Mediterranean, and the Western Pacific, where it was responsible for 99.7%, 62.8%, 69%, and 71.9%, respectively, of the malaria cases in 2017, while P. vivax is the most prevalent in the WHO Region of the Americas, where it was responsible for 74.1% of cases [1, 2]. Medicinal chemists have tested numerous compounds against Plasmodium parasites to find the most efficient inhibitors. Aminoquinolines such as chloroquine were used for several decades as first-line antimalarial drugs [3]. Although the efficacy of chloroquine has been diminished by Plasmodium falciparum resistance [4, 5], it has remained efficacious in some Caribbean countries and Central America [6]. Since 2005, the World Health Organization has recommended artemisinin-based combination therapies (ACTs) for the treatment of P. falciparum malaria [7, 8]. These chemotherapies show excellent efficacy, especially in the African region. However, as experience with other antimalarial drugs has shown, parasite resistance to any antimalarial drug is very likely to develop and spread [9]. Furthermore, resistance to ACTs has been reported to be increasing in Southeast Asia, and its spread to other regions is a serious challenge [10,11,12]; hence the need for new antimalarial drugs.

Combination therapies such as ACTs are costly and, because of drug-drug interactions, have more toxic side effects than single drugs [13]. An alternative to ACTs is hybrid compounds, that is, molecules containing two or more active pharmacophores that can act simultaneously on two or more molecular targets [14,15,16,17]. Such molecules are active against the erythrocytic and liver stages of malaria infection; therefore, they can help in fighting resistance and in meeting the agenda of eradicating malaria [18]. The current search for antimalarials focuses on hybrid compounds containing quinoline, one of the important pharmacophores acting against malaria [19,20,21].

Conventional drug discovery methods are expensive and time-consuming, and they require the sacrifice of animals or compounds in their pure forms [22]. Effective and efficient techniques that can screen chemical databases of molecules with known activities against a particular infection are therefore necessary [23]. Quantitative structure-activity relationship (QSAR) modeling and molecular docking studies have been used successfully in drug development as cost- and time-effective techniques [24, 25]. QSAR is a significant modeling method for structural optimization and drug design [23, 26]. Herein, we conducted a QSAR study of tetraoxane-8-aminoquinoline hybrids as dual-stage antimalarial agents to produce a model that could be used to design new potent antimalarial agents. We also carried out a molecular docking study of the hybrid compounds with the Plasmodium falciparum lactate dehydrogenase (pfLDH) enzyme to investigate their interaction with this potential target enzyme. Tetraoxane-8-aminoquinoline hybrids were reported to be metabolically stable and active against both erythrocytic and liver-stage malaria parasites [21].

2 Methods

2.1 Data collection

Twenty-two 1,2,4,5-tetraoxane-8-aminoquinoline hybrids and their in vitro antimalarial activities (EC50) against the intraerythrocytic P. falciparum W2 strain were obtained from the paper published by Capela and coworkers [21]. The antiplasmodial activities of the compounds, reported as EC50 (μM), were transformed to pEC50 (pEC50 = −log EC50) for the purpose of this research. The structures of the molecules and their activities are presented in Table 1.
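The transformation to pEC50 can be illustrated with a minimal sketch. The text states only that pEC50 = −log EC50; the conversion of EC50 from μM to mol/L before taking the base-10 logarithm is an assumption of this sketch, and the function name is hypothetical.

```python
import math


def pec50_from_micromolar(ec50_um):
    """Return pEC50 = -log10(EC50), with EC50 given in micromolar.

    Assumption (not stated in the source): EC50 is converted to mol/L
    before the logarithm is taken.
    """
    if ec50_um <= 0:
        raise ValueError("EC50 must be a positive concentration")
    ec50_molar = ec50_um * 1e-6  # micromolar -> mol/L
    return -math.log10(ec50_molar)
```

Under this assumption, an EC50 of 1 μM corresponds to a pEC50 of 6, and a lower EC50 (a more potent compound) gives a higher pEC50.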

Table 1 Molecular structures of 1,2,4,5-tetraoxane-8-aminoquinoline hybrids and their antimalarial activities

2.2 Geometric optimization

The structures of the molecules shown in Table 1 were drawn using the ChemDraw version 12.0.2 software [27] and optimized using the Spartan 14 version 1.1.4 software with the semi-empirical (PM3) quantum mechanics method [28].

2.3 Molecular descriptor calculation

A total of 1875 molecular descriptors of the optimized molecules of 1,2,4,5-tetraoxane-8-aminoquinoline hybrids were computed with PaDEL-Descriptor software version 2.20 [29].

2.4 Normalization and data pretreatment

Using Eq. (1), the computed descriptors were normalized so that each variable has an equal opportunity to influence the construction of a good model [30].

$$ X=\frac{X_{\mathrm{i}}-{X}_{\mathrm{min}}}{X_{\mathrm{max}}-{X}_{\mathrm{min}}} $$
(1)

where X is the normalized value of a descriptor, Xi is the descriptor’s value for each molecule, and Xmin and Xmax are the minimum and maximum values of that descriptor. To eliminate redundancy, the normalized data were then pretreated using the Data Pretreatment software obtained from the Drug Theoretical and Cheminformatics Laboratory (DTC Lab).
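The normalization of Eq. (1) can be sketched as follows, with rows as molecules and columns as descriptors. The function name is hypothetical, and mapping a constant descriptor column to 0.0 is an assumption made here only to avoid division by zero; it does not reproduce the DTC Lab pretreatment step.

```python
def min_max_normalize(matrix):
    """Scale every descriptor column of `matrix` to the range [0, 1] (Eq. 1)."""
    columns = list(zip(*matrix))  # transpose: one tuple per descriptor
    scaled_columns = []
    for col in columns:
        x_min, x_max = min(col), max(col)
        span = x_max - x_min
        # Assumption: a constant column carries no information, so it maps to 0.0
        scaled_columns.append([(x - x_min) / span if span else 0.0 for x in col])
    # transpose back so that each row is again one molecule
    return [list(row) for row in zip(*scaled_columns)]
```

After scaling, descriptors with very different numerical ranges contribute on a comparable scale to the regression.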

2.5 Data division

Kennard and Stone’s algorithm [31] was employed to divide the pretreated data into a training set (70%) for model generation and a test set (30%) for external validation of the model. This was achieved using the Data Division software obtained from the Drug Theoretical and Cheminformatics Laboratory (DTC Lab).
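The idea behind the Kennard-Stone selection can be illustrated with a minimal sketch: the two most distant compounds in descriptor space seed the training set, and the compound farthest from its nearest selected neighbour is added repeatedly. This is not the DTC Lab implementation; the function name and the use of Euclidean distance are assumptions of the sketch.

```python
import math


def kennard_stone_split(X, n_train):
    """Return (train_indices, test_indices) chosen by the Kennard-Stone rule."""
    n = len(X)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of compounds")
    # pairwise Euclidean distances in descriptor space (assumed metric)
    dist = [[math.dist(a, b) for b in X] for a in X]
    # seed the training set with the two most distant compounds
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda ij: dist[ij[0]][ij[1]])
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_train:
        # pick the compound whose nearest selected neighbour is farthest away
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining
```

The compounds left unselected form the test set, so the training set spans the descriptor space and the test compounds lie within it.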

2.6 Model generation

Using the genetic function approximation (GFA) technique in the Materials Studio software, regression analysis was carried out on the training set to generate the model, with the activities (pEC50) as the dependent variable and the descriptors as the independent variables.

2.7 Internal validation of the model generated

The model generated was assessed using the Friedman formula [32], defined as:

$$ LOF=\frac{\mathrm{SEE}}{{\left(1-\frac{c+ dp}{M}\right)}^2} $$
(2)

where LOF is the Friedman’s lack-of-fit (a measure of the fitness of a model), SEE is the standard error of estimation, p is the total number of descriptors in the model, d is the user-defined smoothing parameter, c is the number of terms in the model, and M is the number of compounds in the training set.

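SEE is defined as:

$$ SEE=\sqrt{\frac{\sum {\left({Y}_{\mathrm{exp}}-{Y}_{\mathrm{prd}}\right)}^2}{N-p-1}} $$
(3)

where Yexp and Yprd are the experimental and predicted activities in the training set, N is the number of compounds in the training set (M in Eq. (2)), and p is the number of descriptors in the model. SEE is equivalent to the standard deviation of the model, and a low value indicates a good model.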
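The two quantities can be computed as in the following sketch. SEE is taken as the square root of the sum of squared residuals divided by N − p − 1, and LOF follows Eq. (2) as written in this paper. The function names are hypothetical, and the smoothing parameter d is left to the caller because the text does not state its value.

```python
import math


def standard_error_of_estimate(y_exp, y_prd, p):
    """SEE: sqrt(sum of squared residuals / (N - p - 1))."""
    n = len(y_exp)
    sse = sum((e - q) ** 2 for e, q in zip(y_exp, y_prd))
    return math.sqrt(sse / (n - p - 1))


def friedman_lof(see, c, d, p, m):
    """LOF as in Eq. (2): SEE / (1 - (c + d*p)/M)^2."""
    return see / (1.0 - (c + d * p) / m) ** 2
```

Because the denominator shrinks as terms and descriptors are added, LOF penalizes larger models and so discourages overfitting during the genetic search.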

The squared correlation coefficient, R2, of a built model is another parameter considered; the closer it is to 1.0, the better the model. R2 is expressed as:

$$ {R}^2=1-\frac{\sum {\left({Y}_{\mathrm{exp}}-{Y}_{\mathrm{prd}}\right)}^2}{\sum {\left({Y}_{\mathrm{exp}}-{Y}_{\mathrm{mtrn}}\right)}^2} $$
(4)

where Yprd, Yexp, and Ymtrn are the predicted, experimental, and mean experimental activities in the training set, respectively.

The value of R2 never decreases as descriptors are added to the model; hence, R2 alone is not a reliable measure of the stability of the model. Thus, to have a reliable and stable model, R2 is adjusted according to the expression:

$$ {R}_{\mathrm{adj}}^2=\frac{\left(n-1\right){R}^2-p}{n-p-1} $$
(5)

where p is the number of descriptors in the model and n is the number of compounds used in the training set.
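A minimal sketch of both statistics is given below. The adjusted value uses the standard form 1 − (1 − R2)(n − 1)/(n − p − 1), which is algebraically equal to ((n − 1)R2 − p)/(n − p − 1). The function names are hypothetical.

```python
def r_squared(y_exp, y_prd):
    """R2 = 1 - SS_res / SS_tot on the training set."""
    mean_exp = sum(y_exp) / len(y_exp)
    ss_res = sum((e - q) ** 2 for e, q in zip(y_exp, y_prd))
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Penalize R2 for the number of descriptors p given n training compounds."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
```

For a fixed R2, the adjusted value falls as p grows, which is why it is preferred when comparing models with different numbers of descriptors.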

The cross-validation coefficient, Q2cv, is expressed as:

$$ {\mathrm{Q}}_{\mathrm{cv}}^2=1-\frac{\sum {\left({Y}_{prd}-{Y}_{\mathrm{exp}}\right)}^2}{\sum {\left({Y}_{\mathrm{exp}}-{Y}_{mtrn}\right)}^2} $$
(6)

where Yprd, Yexp, and Ymtrn are the predicted, experimental, and average experimental activity in the training set, respectively.
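The cross-validation coefficient can be illustrated with a sketch under two assumptions that the text does not state: that the cross-validation is leave-one-out, and that an ordinary least-squares fit stands in for the GFA regression. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(X, y):
    """Return [intercept, b1, ..., bp] from the normal equations."""
    A = [[1.0] + list(row) for row in X]  # prepend the intercept column
    k = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(k)]
    return solve_linear(ata, aty)


def predict(coef, row):
    return coef[0] + sum(c * x for c, x in zip(coef[1:], row))


def q2_loo(X, y):
    """Q2cv of Eq. (6) with leave-one-out predictions (assumed scheme)."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        # refit the model without compound i, then predict compound i
        coef = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (predict(coef, X[i]) - y[i]) ** 2
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot
```

Each compound is predicted by a model that never saw it, so Q2cv is normally lower than R2; a large gap between the two suggests overfitting.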

2.8 External validation of the model generated

The generated model was assessed (using test set) for external validation by the value of R2test expressed as:

$$ {R}_{test}^2=1-\frac{\sum {\left({Y}_{\mathrm{prd}}-{Y}_{\mathrm{exp}}\right)}^2}{\sum {\left({Y}_{\mathrm{exp}}-{Y}_{\mathrm{mtrn}}\right)}^2} $$
(7)

where Yprd and Yexp are respectively the predicted and experimental activities of the test set, and Ymtrn is the mean experimental activity of the training set. The closer the value is to 1.0, the better the model generated [33].
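Eq. (7) differs from the training-set R2 only in that the residuals come from the test set while the reference mean comes from the training set. A minimal sketch, with a hypothetical function name:

```python
def r2_test(y_exp_test, y_prd_test, y_mean_train):
    """External R2 of Eq. (7): test-set residuals against the training-set mean."""
    ss_res = sum((q - e) ** 2 for e, q in zip(y_exp_test, y_prd_test))
    ss_tot = sum((e - y_mean_train) ** 2 for e in y_exp_test)
    return 1.0 - ss_res / ss_tot
```

A value of zero means the model predicts the test compounds no better than the training-set mean would.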

2.9 Y-randomization test

In the Y-randomization test, random multi-linear regression models were generated from the training set after shuffling the activities; the average R2 and Q2 values of these models have to be low for the QSAR model to be robust [33]. The coefficient cR2p, whose value has to be greater than 0.5 to pass this test, is also calculated in the Y-randomization test and is expressed as:

$$ c{R}_p^2=R\times \sqrt{R^2-{R}_r^2} $$
(8)

where R is the correlation coefficient of the original (non-randomized) model and R2r is the average R2 of the random models.
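The coefficient can be computed as in the sketch below, assuming the commonly used form cR2p = R × sqrt(R2 − R2r), where R2 belongs to the original model and R2r is the mean R2 of the randomized models. The function name is hypothetical, and clamping a negative difference to zero is a choice made only for this sketch.

```python
import math


def c_r2_p(r2_model, r2_random_trials):
    """cR2p = R * sqrt(R2 - mean R2 of the Y-randomized models)."""
    r2_r = sum(r2_random_trials) / len(r2_random_trials)
    # Sketch-only choice: treat random models that beat the real one as a zero score
    diff = max(r2_model - r2_r, 0.0)
    return math.sqrt(r2_model) * math.sqrt(diff)
```

The score approaches R2 of the original model when the randomized models explain almost nothing, and falls towards zero when they fit nearly as well as the original.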

2.10 Applicability domain of the generated model

The leverage (hi) method was used to describe the applicability domain of the QSAR model [34]. For a chemical compound i, the leverage is expressed as:

$$ {h}_{\mathrm{i}}={X}_{\mathrm{i}}{\left({X}^{\mathrm{T}}X\right)}^{-1}{X}_{\mathrm{i}}^{\mathrm{T}} $$
(9)

where Xi is the descriptor row vector of compound i, X is the n × k descriptor matrix of the training set compounds used to generate the model, and XT is the transpose of X. The warning leverage, h*, is the threshold value of hi above which a compound is considered influential, and is expressed as:

$$ {h}^{\ast }=\frac{3\left(p+1\right)}{n} $$
(10)

where n is the number of training compounds and p is the number of descriptors in the model.
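The leverage and warning leverage can be computed as in the following sketch. Instead of forming the inverse explicitly, it solves (XᵀX)z = xᵢᵀ for each compound and takes hᵢ = xᵢ·z, which gives the same value. The function names are hypothetical, and X is used exactly as supplied (no intercept column is added).

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def leverages(X_train, X_query):
    """h_i = x_i (X^T X)^-1 x_i^T for every row x_i of X_query."""
    k = len(X_train[0])
    xtx = [[sum(r[i] * r[j] for r in X_train) for j in range(k)] for i in range(k)]
    result = []
    for x in X_query:
        z = solve_linear(xtx, list(x))  # z = (X^T X)^-1 x^T
        result.append(sum(xi * zi for xi, zi in zip(x, z)))
    return result


def warning_leverage(p, n):
    """h* = 3(p + 1)/n, with p descriptors and n training compounds."""
    return 3 * (p + 1) / n
```

A compound whose leverage exceeds h* lies far from the bulk of the training set in descriptor space, so its prediction is an extrapolation.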

2.11 Quality assurance of the generated model

Table 2 presents the minimum values of the internal and external validation parameters required for a QSAR model to be predictive and reliable [34].

Table 2 The minimum required values for a QSAR model to be generally acceptable

2.12 Docking study

To elucidate the interaction of 1,2,4,5-tetraoxane-8-aminoquinolines with a possible molecular target, a molecular docking study of the hybrid compounds was conducted with Plasmodium falciparum lactate dehydrogenase (PfLDH), a potential target enzyme for antimalarials because the parasite relies on glycolysis to produce energy [35]. The Discovery Studio software was used to prepare the crystal structure of the enzyme obtained from the Protein Data Bank (PDB ID: 1CET) as the receptor, and the compounds were prepared as the ligands. AutoDock Vina in the PyRx software was used to dock the ligands to the receptor [36]. The docking results were visualized and analyzed with the aid of Discovery Studio Visualizer.

3 Results

The genetic function approximation (GFA) of the Materials Studio software was used to build four QSAR models relating the chemical structures of the 1,2,4,5-tetraoxane-8-aminoquinoline hybrids to their biological activities as potent antimalarials. One of the built models was selected for its statistical significance and is reported as follows:

pEC50 = 33.566456798 * MATS3m − 18.570253404 * GATS8p + 16.287782272 * GATS8i + 0.044070689 * RDF50s + 6.676939310
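The selected model can be applied to a new compound as in the sketch below. The coefficients and intercept are those of the reported equation; the function name is hypothetical, and the descriptor values supplied must be on the same scale as those used to build the model.

```python
# Coefficients and intercept of the reported GFA model
COEFFICIENTS = {
    "MATS3m": 33.566456798,
    "GATS8p": -18.570253404,
    "GATS8i": 16.287782272,
    "RDF50s": 0.044070689,
}
INTERCEPT = 6.676939310


def predict_pec50(descriptors):
    """Return the predicted pEC50 from a dict of the four descriptor values."""
    return INTERCEPT + sum(
        coefficient * descriptors[name] for name, coefficient in COEFFICIENTS.items()
    )
```

The signs of the coefficients show directly that higher MATS3m, GATS8i, and RDF50s values raise the predicted activity, while a higher GATS8p value lowers it.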

Table 3 presents the validation parameters of the model which satisfied the minimum required values presented in Table 2.

Table 3 Validation parameters for the selected model

4 Discussion

The model contains three 2D autocorrelation descriptors (MATS3m, GATS8p, and GATS8i) and one radial distribution function descriptor (RDF50s). MATS3m is the Moran autocorrelation of lag 3 weighted by atomic masses; GATS8p and GATS8i are the Geary autocorrelations of lag 8 weighted by atomic polarizabilities and first ionization potentials, respectively. The 2D autocorrelation descriptors describe how the values of a certain function (an atomic property) are correlated at intervals equal to the lag (a topological distance). Descriptors of the GATSd and MATSd types are slightly different but generally describe how the considered property is distributed along the topological structure [37, 38]. RDF50s is the 3D radial distribution function at an inter-atomic distance of 5.0 Å weighted by relative I-state. RDF-type descriptors of a molecule indicate the probability distribution of finding an atom in a spherical volume of radius R [39]. RDF50s indicates the existence of a linear relationship between the antiplasmodial activities of the 1,2,4,5-tetraoxane-8-aminoquinoline hybrids and the 3D molecular distribution of the relative inductive effect of atoms or groups of atoms in the molecules, calculated at a radius of 5.0 Å from the geometrical center of each hybrid molecule.

Table 4 shows the experimental and predicted activities of the 1,2,4,5-tetraoxane-8-aminoquinoline hybrids as potent inhibitors of the multidrug-resistant Plasmodium falciparum W2 strain, together with the residual values. The low residual values between the experimental and predicted activities of the compounds indicate the high predictability of the model.

Table 4 Experimental and predicted activities for the compounds with residual

Pearson’s correlation matrix, variance inflation factor (VIF), and mean effect (ME) of the four descriptors in the model are presented in Table 5. The correlation matrix shows no significant inter-correlation among the descriptors used in building the model, as corroborated by the VIF values, which were less than 10 for all the descriptors. Hence, the descriptors used in building the model were good, and the model is stable without a multicollinearity problem. The ME indicates the magnitude and direction of the influence of each descriptor on the antiplasmodial activities of the compounds. The descriptors MATS3m, GATS8i, and RDF50s, with positive ME values, vary directly with the activities of the molecules, while the descriptor GATS8p, with a negative ME value, varies inversely with the activities of the molecules. The ME magnitudes indicate the extent of the respective influences, with GATS8p having the greatest influence on the antiplasmodial activities of the compounds. The Y-randomization test result presented in Table 6 confirmed that the built QSAR model was reliable, robust, and stable, given the low R2 and Q2 values over several trials. The cR2p value (> 0.5) also shows that the model is good and was not obtained by chance.
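The text does not give formulas for VIF and ME, so the following sketch rests on assumptions: VIF is taken in its standard form 1/(1 − R2j), where R2j is obtained by regressing descriptor j on the other descriptors, and ME is taken in the form often used in QSAR reports, in which the product of each coefficient and its descriptor sum is divided by the total over all descriptors. The function names are hypothetical.

```python
def variance_inflation_factor(r2_j):
    """VIF_j = 1 / (1 - R2_j); R2_j comes from regressing descriptor j on the others."""
    return 1.0 / (1.0 - r2_j)


def mean_effects(coefficients, descriptor_columns):
    """Assumed ME form: ME_j = b_j * sum_i(d_ij) / sum_j(b_j * sum_i(d_ij))."""
    contributions = {
        name: coefficients[name] * sum(values)
        for name, values in descriptor_columns.items()
    }
    total = sum(contributions.values())
    return {name: value / total for name, value in contributions.items()}
```

Under this definition a VIF below 10 corresponds to R2j below 0.9, and the ME values of all descriptors sum to one, so their signs and relative sizes can be compared directly.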

Table 5 Pearson’s correlation, variance inflation factor (VIF), and mean effect (ME) of descriptors used in the selected model
Table 6 Y-randomization test result

Figure 1 presents the plot of predicted activity against the experimental activity of both the training and test sets. The linearity of this plot indicates the high predictive power of the built model. The plot of standardized residuals against experimental activity presented in Fig. 2 shows the dispersal of standardized residual values on both sides of zero; hence, there was no systematic error in the generated model [40]. Figure 3 shows the Williams plot of the standardized residuals versus the leverages. All compounds were within the applicability domain, and none was influential. This implies that any of the compounds can be used in designing new 1,2,4,5-tetraoxane-8-aminoquinoline hybrids with highly potent antiplasmodial activities.

Fig. 1 Plot of predicted activity against experimental activity of both training and test set

Fig. 2 Plot of standardized residual activity against experimental activity

Fig. 3 Plot of the standardized residuals against the leverages (Williams plot)

Table 7 presents the results of the molecular docking study carried out between PfLDH (receptor) and the 1,2,4,5-tetraoxane-8-aminoquinoline hybrid compounds (ligands). The results show strong interactions of the ligands with the active sites of the receptor, with binding affinities ranging from −6.3 to −10.9 kcal/mol and important hydrogen bonding and hydrophobic interactions with the amino acids of the protein. The binding affinities of all the hybrids are better than that of chloroquine. Figure 4 shows the 2D and 3D interactions of ligand 22 with the receptor. This complex had the best binding affinity (−10.9 kcal/mol) and contained two conventional hydrogen bonds: one between an oxygen atom of the tetraoxane moiety as the H-acceptor and the ARG109 residue as the H-donor, and the other between the NH of the quinoline moiety as the H-donor and the PRO246 residue as the H-acceptor. The interaction also contains a carbon-hydrogen bond between the methoxy carbon atom of the quinoline moiety as the H-donor and the ASN140 residue as the H-acceptor. The ligand also formed two hydrophobic interactions of the alkyl-alkyl type with the ALA236 amino acid of the receptor and a halogen-type interaction between ASP53 and the bromine atom of the ligand. The hydrogen bonds and the hydrophobic interactions of ligand 22 with the receptor are depicted in Figs. 5 and 6, respectively.

Table 7 Docking result between pfLDH and the selected 1,2,4,5-tetraoxane-8-aminoquinoline hybrids
Fig. 4 3D and 2D 22-pfLDH interactions

Fig. 5 Hydrogen bond 22-pfLDH interactions

Fig. 6 Hydrophobic 22-pfLDH interactions

5 Conclusion

QSAR and molecular docking studies were conducted on 1,2,4,5-tetraoxane-8-aminoquinoline hybrids as potent antimalarials. A stable, reliable, and robust model was generated and found to be influenced by the MATS3m, GATS8p, GATS8i, and RDF50s descriptors. MATS3m, GATS8i, and RDF50s were found to influence the antiplasmodial activities of the compounds positively, while GATS8p influenced them negatively and had the greatest influence. The molecular docking study revealed the mode of interaction of the hybrid compounds with Plasmodium falciparum lactate dehydrogenase as the potential target. The results show strong interactions of the compounds with the receptor. The QSAR model coupled with the docking results can be employed in designing new 1,2,4,5-tetraoxane-8-aminoquinoline hybrids with highly potent activities against Plasmodium falciparum.