Background

The biochemistry we observe in life on Earth is an island in the chemical space of possible biochemistry. Not all possible small organic molecules are made by life, and the chemicals making up the metabolic pathways common to life are limited to a small number of classes of chemicals – aldehydes, polyols, amines, alpha amino acids and so on. Understanding why biochemistry uses the molecules that it does is central both to engineering biochemistry to produce useful products and to understanding how terrestrial biochemistry originated. Is the restriction on the observed chemistry of life simply because life has not evolved the catalysts needed to make other molecules, because life has not found a need for them, or because there is selection against chemistry outside ‘biochemical space’?

It is plausible to suggest that life simply has not invented the means to make some classes of chemicals. We know that life makes carbon-carbon bonds using aldol condensation and not (for example) metathesis [1, 2], although metathesis enzymes can be designed in principle [3]. There may simply be no functional reason for making some molecules that would drive the evolution of the relevant enzymatic mechanisms.

There may also be limits on what biochemistry can achieve outside those imposed by catalytic mechanisms and the function of metabolites. For example, it has previously been shown that a simple measure of the degree of saturation of a molecule may be used to indicate that molecule’s toxicity, in the absence of any other structural information about the molecule, a finding that is related to the distribution of biochemicals in chemical space [4]. It would be surprising if this were the only such constraint on the molecules of life.

In this paper I present evidence that there is selection against the incorporation of chemicals that contain structural features not found in central metabolism – chemicals that I term ‘Unbiological’ – into metabolism, separate from the constraints provided by selection for specific function and the ability of life to catalyse specific types of reaction. Specifically, the sections below argue that:

  i) the chemical space of the biochemicals that are common to life on Earth is a small subset of the chemical space possible to the chemistry of life ('Biochemistry occupies a limited chemical space');

  ii) chemicals outside biochemical space have a higher chance of being toxic at millimolar concentrations than chemicals that fall inside biochemical space ('Mild toxicity is correlated with ‘UnBiological’ chemical characteristics' through 'Threshold for correlations is millimolar concentration');

  iii) a wide range of experimental data suggests that many small molecules bind to many proteins with low millimolar affinity, which provides a mechanism for the toxicity of chemicals at millimolar concentrations ('Mechanism of Ub correlation with toxicity');

  iv) the reason for the correlation between the toxicity of chemicals and their distance from biochemistry is that life has systematically evolved proteins to avoid unwanted millimolar interactions with metabolites in order to avoid poisoning itself ('Proposed mechanism of correlation of Ub with millimolar toxicity').

The results in the paper are in two parts to reflect this reasoning. The Results and discussion sections (Figs. 2, 4 and 5) describe the chemical space of life and the low-level toxicity of chemicals falling outside this space. The sections on Mechanism of Ub correlation with toxicity and Proposed mechanism of correlation of Ub with millimolar toxicity (Figs. 6 and 7) provide an explanation for this effect.

These results suggest that biochemistry is more of an integrated whole than the conventional metabolic map would suggest. This has theoretical and practical implications, which I discuss briefly at the end of the paper.

Results and discussion

Biochemistry occupies a limited chemical space

I first establish that biochemical space is a relatively small subset of the possible chemical space from which metabolism could be selected. It is a commonplace that many of the components of primary metabolism “look similar to each other” (as undergraduates learning how to distinguish the α-amino acids or the sugars of the Calvin Cycle can attest). This section establishes that this apparent limitation of metabolism to a few chemical types is a real restriction in chemical space.

The chemical space from which metabolism is selected is the space of chemicals made from C, N, O, and H, with S as S(II) and P as P(V), bonded in ways that are found in biological molecules. For example, 2-amino-4-hydroxyhexanoate looks like a plausible amino acid, but it happens not to be made by life (Footnote 1), whereas ACCA (Fig. 1) does not fit an intuitive feeling of what a biochemical looks like, as few biological compounds contain a cyclobutane ring. Of the myriad compounds that can be formed from the elements C, N, O, P and S (and H), life rarely forms hydrazines, peroxides, rings of less than five atoms, or phosphorus compounds other than phosphates.

Fig. 1 Examples of non-metabolites

These rules, and some others relating to molecular stability, were implemented in the program Combimol as previously described [5]. The chemical space of possible biochemicals includes structures usually excluded from drug design due to their sensitivity to metabolism [6]. From the chemical space of all such molecules, all Fragments were generated as described in [7, 8]. This provides a library of Substructures with which to probe the space of actual molecules that make up metabolism.

Life’s metabolic diversity is enormous [9]. For the purposes of this paper, I use a small subset of metabolites that are components of the central metabolic processes of all life on Earth, and pragmatically those processes that are shown on the Roche/Expasy metabolic map [10]. All the small molecules listed in Part 1 (“Metabolic Pathways”) were used as a set of metabolites here called “core metabolism”, a collection of 611 molecules widely used by all life on Earth (some steroid hormones were not used, as they are chemically very similar and so contribute no new chemical structural types to the data set).

There are more Fragments of 5, 6 and 7 atoms than there are metabolites in core metabolism, so we would expect that some of them would not be represented in that metabolism. The chance that a 5-atom Fragment will be a substructure of a molecule depends on the size of the molecule. Figure 2a shows the expected fraction of those Fragments that would not be found in a set of 611 molecules if the molecules were constructed randomly from the atoms and bonds found in core metabolism. (The algorithm used to estimate the frequency with which a Fragment will match a molecule selected at random from the space of chemicals is described in more detail in Appendix 1.) Fig. 2a shows that the expected number of Fragments that are not found is substantially smaller than the actual number: core metabolism must represent a small subset of the chemical space of possible water-stable chemicals that can be made from C, N, O, S(II) and P(V). Figure 2b extends this analysis to the ~45,000 natural product chemicals in the Dictionary of Natural Products (DNP - [11]). DNP records the detection and structural analysis of organic chemicals from any natural source, and so samples the full diversity of chemistry of terrestrial life. If terrestrial biochemistry sampled all the chemical space of CHON, S(II) and P(V) chemistry, then essentially all of the Fragments searched here should be represented in the database. However, over 50 % of 6-atom Fragments are not found in the database.
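The intuition behind the expected (‘random’) fraction can be sketched in a few lines. This is an illustration only, assuming that each Fragment matches molecules drawn at random from chemical space independently with some fixed probability; the algorithm actually used is described in Appendix 1 and may differ in detail.

    # Illustrative sketch (Python): if fragment i matches a molecule drawn at
    # random from chemical space with probability p_i, the chance that it is
    # absent from a set of 611 independently drawn molecules is (1 - p_i)**611.
    # Averaging over all fragments of a given size gives an expected
    # "not found" fraction of the kind plotted as red circles in Fig. 2a.
    def expected_unfound_fraction(match_probabilities, n_molecules=611):
        absent = [(1.0 - p) ** n_molecules for p in match_probabilities]
        return sum(absent) / len(absent)

    # Fragments that each match ~1 % of random molecules are rarely missing
    # from a 611-molecule sample:
    print(expected_unfound_fraction([0.01, 0.005, 0.02]))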

Fig. 2 Extent of biochemical space. Fraction (Y axis) of Fragments derived from the space of all possible chemicals that are not found in actual metabolites, compared to the fraction that would be expected not to be found in the same number of chemicals sampled at random from the chemical space of possible metabolites, plotted as a function of fragment size (N – X axis). Blue squares – fraction of fragments not found in actual metabolites. Red circles – fraction not found in an equivalent-sized collection of random molecules. Panel a: fragments not found in the ‘core metabolism’ of 611 molecules represented in the ExPasy metabolic map. Panel b: Fragments not found in the ~45,000 unique molecules listed in the Dictionary of Natural Products [11]

Why is biochemistry apparently limited to a subset of the possible chemistry that life could perform? The next three sections demonstrate that chemicals falling outside the chemical space occupied by biochemistry are not merely unlikely to be part of a metabolism, but interfere with that metabolism in such a way as to produce a toxic effect, and the further outside ‘biochemical space’ they are, the greater that antagonism.

Mild toxicity is correlated with ‘UnBiological’ chemical characteristics

In this section I introduce a measure of how different a chemical is from the chemical space of life. I show that a greater difference is correlated with low levels of non-specific toxicity. Toxicity is related to the existence of structures in the test chemical that are different from chemical structures usually found in biology.

Chemicals can be toxic for one of three broad reasons.

Toxic chemicals can be chemically reactive, such as formaldehyde or mercury compounds, and so chemically modify the components of life. Reactive toxicity depends on specific chemical functionality. The Combimol chemical generation software automatically excludes reactive moieties, and so this is not a class of toxicity probed by these studies.

Toxic chemicals can interact with a specific molecular mechanism in the organism, and so disrupt a particular biochemical function (see discussion in [12–17]). Drugs and plant secondary metabolite toxins achieve their effect in this way. This is caused by very specific chemical structures, which confer specific toxicity on molecules that contain them. Such ‘structural alerts’ were originally identified to predict mutagenicity [18], but have been extended to more general toxicity prediction in programs such as DEREK [19], TOPKAT, MULTICASE [20] and others [21]. In my terms, a ‘structural alert’ is a Fragment that has a high affinity for a specific molecular target whose blockade produces a toxic effect. In agreement with this, ‘structural alert’ approaches to toxicity prediction or other structure-activity relationship methods that try to relate large structural features to biological endpoints work well for specific toxicity mechanisms, such as HERG blockade giving rise to cardiac toxicity [7] or electrophilic attack on DNA giving rise to carcinogenicity [18].

Structural alert approaches do not work well for predicting broad toxicity endpoints, such as death [22]. A wide range of industrial chemicals have, or are claimed to have, toxicity that is not severe or life-threatening at low concentrations, and which is not obviously linked to structural alerts, but which nevertheless causes morbidity and mortality in model organisms at higher concentrations. Interactions of some of these chemicals with various receptors or enzymes are claimed, but most are simply observed to disable or kill model species without a mechanism for their toxicity being known or postulated. It is this third class of low potency, non-specific toxicity that I have probed further below.

I use a Fragment-based approach to identify the largest part of a molecule which is different from anything found in biology. Fragment-based methods of describing molecules are well known, computationally simple approaches to describing a molecule in terms of how its structure would be drawn by a chemist [23]. Several groups have described using a fragment-based approach for molecular description and design [24–26], claiming that building drug-like molecules from chemical fragments derived from biochemicals leads to more ‘drug-like’ results.

I define a measure of the fraction of a molecule that is not similar to a biological molecule, which I call “UnBiological” (Ub). Ub is defined as the largest region of a test molecule that does not overlap with at least one molecule in the core metabolite set. Ub has to be defined in terms of the size of the Fragments used to determine overlaps. Because the difference between the expected and observed occupancy of chemical space shown in Fig. 2 is greatest for 5-atom and 6-atom Fragments, 5-atom and 6-atom overlaps were both used for this study, designated Ub5 and Ub6 respectively. The algorithm used to generate the Ub measures is summarised in the Methods section, with more detail on the actual computational steps used in Appendix 2, and a graphic summary of the process in Fig. 3.

Fig. 3 Cartoon of calculating UnBiological (Ub). This takes a ‘toy domain’ of four metabolites and three target molecules to explain the process. Only fragments of 3 or 4 atoms are considered in this example. In reality there are 611 metabolites, ~5000 target molecules (note that a number of molecules are tested in more than one experimental series in Table 3) and 30912 Fragments of size 3 to 14 atoms. Metabolites and target molecules are used to generate Fragments that are present in at least two of the overall set of molecules (this is a convenient limitation on the number of Fragments, and may be revised in future implementations of the algorithm). Fragments are classified as to whether they occur in the set of metabolites (green) or do not occur in metabolites (red). The target set of molecules is then matched to the set of Fragments that do not occur in metabolites – the size of the largest such Fragment is the Ub measure. Note that in this simplified model it is clear that the presence of a chlorine atom confers ‘UnBiological-ness’ on a molecule. The size of the Ub fragment can be the same as the size of the whole molecule (e.g. 1-chloropropane in this example). As illustrated here, this approach takes no account of the potential reactivity of a molecule, only its topological structure

In Table 1 I show the result of correlating the values of Ub calculated for each chemical species with the toxicity endpoints measured for that chemical species, for a variety of measures of general, non-specific toxicity. As is standard in toxicology, the measure of toxicity is the logarithm of the concentration that gives a half-maximal toxic effect in the system under consideration. Log (concentration) scales are commonly used in biochemistry because of the linear relationship between the binding energy of a small molecule binding to a large one and the logarithm of the equilibrium constant of that binding. Pragmatically, a log scale also enables visualization of data spanning many orders of magnitude. The half-maximal effect (EC50 or LD50) is the commonly reported value for many toxicological and pharmacological measures. In the case where the effect is caused by simple binding to a single target, a half-maximal effect represents the concentration at which the target is 50 % occupied, i.e. the Kd.
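To make the single-target picture concrete, the following minimal sketch (an illustration, not part of the analysis pipeline) shows how fractional occupancy and binding free energy depend on concentration and Kd.

    # For a simple 1:1 binding interaction, the fraction of target occupied at
    # free ligand concentration L is L / (L + Kd); a half-maximal effect (0.5
    # occupancy) therefore occurs at L = Kd, and the binding free energy is
    # linear in log(Kd): deltaG = R * T * ln(Kd).
    from math import log

    R = 8.314    # J / (mol K)
    T = 298.0    # K

    def fraction_occupied(conc_molar, kd_molar):
        return conc_molar / (conc_molar + kd_molar)

    def binding_free_energy(kd_molar):
        return R * T * log(kd_molar)   # J/mol; more negative = tighter binding

    print(fraction_occupied(1e-3, 1e-3))        # 0.5 when L equals Kd
    print(binding_free_energy(1e-3) / 1000.0)   # about -17 kJ/mol for Kd = 1 mM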

Table 1 Correlations of Ub with toxicity endpoints

Here the Ub measure (i.e. how much of a molecule does not match a structure found in core metabolism) is negatively correlated with the logarithm of the concentration at which a chemical has a half-maximal effect. This might be LD50 for a lethal toxicity measure or EC50 for a non-lethal measure. A negative value of the correlation means that a larger Ub is associated with a lower concentration, i.e. with a more potent toxin.

In almost all cases, for Ub5 and Ub6, there is a significant negative correlation between Ub and toxic concentration. For a wide range of living systems, from isolated mammalian cells through unicellular plants and protists to multicellular plants and diverse animal species, Ub is correlated with toxicity. This correlation is highly statistically significant. The “***” level of significance in Table 1 indicates a probability of p < 0.000714 (i.e. 0.05/70) that the indicated correlation would be produced by chance. There were around 70 correlations performed for this initial analysis of the data (35 data sets, including two not shown that sub-divide the rat and mouse data into pharmacological categories, which had little effect, each correlated with 2 Ub endpoints). If Ub were uncorrelated with toxicity, there would be only a p = 0.05 chance of observing one “***” level correlation in this data set.

The one exception to the pattern of correlation of Ub with toxicity is Saccharomyces cerevisiae, which shows only weak correlation between the toxicity endpoints reported here and Ub5 or Ub6. A possible reason for this will be discussed below in the section on thresholds for correlations.

I emphasize what this does and does not show. The correlations show robustly that molecules with segments that are not represented in the chemicals of core metabolism have a higher chance of being toxic at any given concentration level than molecules made up of structures found in core metabolism. The larger that “UnBiological” segment is, the more toxic the molecule is. None of the molecules tested for toxicity here are normal components of central metabolism (arguably with the exception of ethanol).

However this is not a method for detecting or predicting pharmacology mediated by a single, known target, or for detecting or predicting toxicity based on a single mechanism. The effects being detected here are relatively non-specific: while many of the toxins are known to interact with proteins, they typically interact with many proteins, and toxic effects often cannot be attributed to a specific molecular interaction. This is illustrated by the exploratory analysis in Table 2. Table 2 shows the result of correlating UnBiological with three conventional toxicity endpoints and two pharmacological ones. HERG toxicity is a significant risk factor for cardiac toxicity in pre-clinical drug candidates, and is detected by screening for blockade of the HERG ion channel in cells [7, 27]. Oestrogenic potential is a common ecotoxicological toxicity measure, and is measured here by binding to the oestrogen receptor [28]. Tadpole narcosis is a whole organism measure of both Central Nervous System penetration and effect on a select set of neurotransmitter receptors [29]. All three are therefore mechanism-based measures of toxicity, and all three show weaker correlations with Ub6 and no correlation with Ub5. An initial statistical analysis of the distribution of Ub5 and Ub6 in the molecules used for the analyses in Tables 3 and 4 (see Appendix 3) suggests that the molecules analysed for Tadpole Narcosis may be atypical of the other sets in the study, and so the lack of correlation found between Ub5 and Tadpole narcosis may be a result of an unrepresentative set of chemicals. The other sets of chemicals whose analysis is summarized in Table 4 appear similar in overall Ub5 and Ub6 properties to those whose analysis is summarized in Table 3.

Table 2 Correlation of Ub with other biological endpoints
Table 3 Biological datasets
Table 4 Structures sets used for docking

Drugs can also be toxic in the wrong place or dose. In Table 2 I also show that Ub is uncorrelated with the potency of compounds against two drug-target classes (for which I had data to hand): antihistamines and NSAIDs that inhibit cyclooxygenase-2. Neither shows any significant correlation with Ub.

I do not claim that other toxicity or pharmacology endpoints will not be found to correlate with UnBiological. The examples in Table 2 are included to make the point that Ub is correlated with broad, whole-organism toxicity, not necessarily with target-specific mechanisms.

Reasons for variability of correlation

Table 1 provides robust statistical evidence for believing that UnBiological is correlated with whole organism toxicity. However the degree of correlation varies substantially between species, as does the statistical significance of that correlation. This could be due to genuine biological differences, or differences in the chemical space being sampled. The issue of chemical space coverage is significant. For example, an initial study suggested a strong correlation of Ub with the potency of phosphodiesterase-4b inhibitors (data not shown). However this was based on analysis of the data in two QSAR studies on PDE4b inhibition. The chemicals in the two studies were very similar to each other (i.e. were two specific series of chemicals). In effect, Ub was being used to classify compounds into the two studies, one of which was developing a much more potent drug series than the other. Therefore Ub could identify more potent PDE4b inhibitors, but for the trivial reason that it was identifying two studies looking at two classes of chemicals. When a wider set of PDE4 inhibitors was analysed, the correlation was reduced (Footnote 2). It seems likely that, as with other QSAR methods, UnBiological will work best on a chemical set spread uniformly across the chemical space that is to be analysed. Bias in the molecules that happen to have been investigated to generate the data analysed here may be a cause of the differences in correlation of Ub and toxicity. This can only be addressed by collecting a more systematic set of multi-species toxicity data on defined chemicals. Data filed for the REACH legislation [30] may provide such a data set in the future.

Statistical significance is a function of sample size. It is not practical to collect hundreds of toxicity endpoints from all the species involved, and not desirable to discard endpoints from species that have been extensively tested. Therefore this aspect of variability has been retained in the study.

Threshold for correlations is millimolar concentration

Many of the correlations summarised in Table 1 are statistically robust but relatively small. Direct plots of Ub vs. toxicity are usually uninformative. A statistically robust correlation can be present in data that does not appear ‘correlated’ to the eye – this is the reason for performing correlation calculations. However, for some of the more strongly correlated data sets containing relatively few data points, such as those plotted in Fig. 4, the correlation between Ub and toxicity appears stronger for weakly potent toxins than for highly potent ones (i.e. the correlation is clearer on the right-hand side of the graphs, and weak or non-existent on the left-hand side). This has also been observed for the correlation of structural redox with toxicity [4].

Fig. 4 UnBiological vs. toxicity for selected organisms. Plots of UnBiological vs. toxicity endpoints for three of the data sets analysed here. Each dot represents the LD50 (X axis) vs Ub value (Y axis) for one compound. Ub is calculated as described in the Methods section and Appendix 2. In summary, Ub represents the largest region on a molecule that is not present in a metabolite, as defined by a 5-atom (Ub5) or 6-atom (Ub6) overlap. a: Ub5 vs. LD50 for Chlorella, b: Ub6 vs. LD50 for Rainbow trout, c: Ub6 vs. LD50 for Lemna, intoxicated with compounds other than herbicides

For some of the data sets there are a sufficiently large number of data points to split the data into potency bands and correlate these independently with UnBiological. The results from this analysis are shown in Fig. 5. For consistency, toxicity data was binned into bands of round number log units of LD50 or EC50, which results in different numbers of data points in each bin, and hence different levels of significance for the resulting correlations.
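The per-band analysis plotted in Fig. 5 can be reproduced along the following lines. This is a sketch only; the file name and column names are placeholders, not those of the published data set.

    # Sketch (Python/pandas/scipy) of the potency-band correlation behind Fig. 5.
    import pandas as pd
    from scipy.stats import pearsonr

    data = pd.read_csv("rat_oral_toxicity_with_ub.csv")   # hypothetical file
    bands = [(-10, -4), (-4, -3), (-3, -2), (-2, 0)]      # log(LD50), molar

    for lo, hi in bands:
        band = data[(data["log_ld50"] >= lo) & (data["log_ld50"] < hi)]
        if len(band) > 2:
            r, p = pearsonr(band["Ub6"], band["log_ld50"])
            print(f"band {lo} to {hi}: n={len(band)}, r={r:.2f}, p={p:.3g}")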

Fig. 5 Correlation of UnBiological with toxicity by potency band. Correlation of Ub5 and Ub6 with different toxicity endpoints. For each data set, compounds were binned by their EC50 or LD50 values, and Ub was correlated with the toxicity values within each concentration range. Thus for Rat oral LD50 (Panel a), toxicity was binned into Log(LD50) < −4, Log(LD50) between −4 and −3, Log(LD50) between −3 and −2, and Log(LD50) > −2, all values in molar. Correlations were calculated for each of these four data sets. Error bars are 95 % confidence limits for the correlation, based on the number of data points in each bin. For all panels: X axis = concentration bins, in log (molar). Y axis: correlation of Ub and toxicity within that data sub-set. Panels a to f: Rat oral toxicity, mouse oral toxicity, rat carcinogenic potential (from CPDBAS), NCI cell line cytotoxicity, Fathead minnow toxicity and Tetrahymena toxicity

Figure 5 shows, for most of the larger data sets, a trend for lower potency toxins to have better correlation with Ub than higher potency toxins. For the NCI cytotoxicity and Fathead Minnow data (Fig. 5d and e respectively) there is little trend for Ub6, although for Ub5 only the highest concentration data (−3 to 0 band) shows a statistically significant correlation (i.e. the 95 % confidence range lies below 0). For all other data sets, both Ub5 and Ub6 show negative correlation of Ub with concentration (i.e. confidence limits are <0) only for the highest concentration band. Figure 5 confirms, for these data sets, that the correlation of UnBiological with toxicity is an effect seen primarily in compounds that have low intrinsic toxicity.

This observation may explain the failure to observe a correlation of Ub with toxicity in Saccharomyces. In the data set analysed here, Saccharomyces was tested for the effects of chemicals at six concentrations from 1.3 uM to 100 uM. Thus no chemical with an IC50 of >100 uM could be detected in this screen, and so the IC50 values analysed here are all below the threshold at which a statistically robust correlation of Ub and toxicity would be expected.

Mechanism of Ub correlation with toxicity

The observation that a simple and non-specific measure of chemical structure like Ub might be correlated with toxicity is unexpected. The observation that the correlation is more pronounced for weak toxins is, on the face of it, baffling. QSAR measures of biological potency are usually more effective for the most potent agents – whether toxins, drugs, hormones or other effectors. The findings in Figs. 4 and 5 therefore require a mechanistic explanation for the correlation of Ub with toxicity that operates at millimolar but not micromolar concentrations. This second part of the paper, and the results shown in it, address the plausibility of a potential mechanism.

The mechanism I propose here is that many, probably most chemicals will interact with some, maybe many, components of the cell at millimolar affinity.

The distinction of small molecules into ‘ligands’ and ‘non-ligands’ is a convenient classification for small molecules with respect to their effects on a specific protein, but it is a fiction not reflected in chemical reality. It suggests an absolute distinction between ‘binding sites’ and ‘non-binding’ sites. In reality, a small molecule can interact with atoms across the surface of a protein, and can often bind to proteins in more than one conformation and at more than one site (discussed further below). Only those sites which are unique and bind molecules with much higher affinity than any other site are called ‘binding sites’. The reality of the other sites that can, and do, interact weakly with small molecules is however illustrated by experimental evolution studies, where new protein functions are typically created by selecting new modes of interaction between protein and ligand from weak interactions already present in the original protein [31].

There is a substantial body of literature that suggests that many, maybe most small molecules can interact with many, possibly most proteins at millimolar concentration. I summarise three lines of such evidence below. This data will be very familiar to those involved in pharmaceutical screening programmes.

High-throughput screen data

High-throughput screening (HTS) is a common route to discovering novel biological function in large libraries of chemicals. In an HTS campaign, a very large library of chemicals (tens to hundreds of thousands) is tested at one or a few concentrations in an entirely automated assay designed to give a simple, semi-quantitative measure of whether a chemical interacts with a specific molecular target. Compounds that reach a threshold of activity (“hits”) are then taken on for further study. Such large screening programmes are a common approach to drug discovery [32].

Typical reports of such screens describe a ‘hit rate’ of between 0.1 and 1 %, and report finding ‘hits’ that bind to the target protein with micromolar affinities [32, 33]. Most freely available databases of the biological effects of molecules also assume that compounds either bind to a target with micromolar affinity or better, or that they do not [34]. Such databases imply that ‘not binding’ at the tested concentration means not binding at all. However this literature is misleading. More detailed reports of HTS campaigns routinely report widespread “non-specific” interaction of small molecules with protein or cellular targets (see e.g. [35–38]). Assay conditions, screening concentrations, detection thresholds and other factors are tuned to achieve a hit rate of <0.1 % in what in reality is a continuum of binding.

Usually the raw data behind an HTS screening programme is not available – only summary statistics and the data on the ‘hits’ is published. However the reality of HTS binding can be illustrated with HTS data available from the National Cancer Institute, which has published detailed screening data on a library of ~47,000 compounds for anti-HIV effect and ~60,000 compounds for anti-cancer effect [39, 40]. Figure 6 summarises this data in terms of the chance that a compound will be found to have a positive effect on a screen at a particular concentration. Not all compounds are tested at higher concentrations, so Fig. 6 plots the fraction of compounds that have an effect at a concentration as a fraction of the compounds tested at that concentration. The result is clear. There is a continuum of affinity in this essentially random set of chemicals for their molecular targets, and while the chance that a compound has an effect at micromolar concentration is low, the chance that it has a biological effect on mammalian cells approaches 1 as the concentration approaches 10 mM.
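The tabulation behind Fig. 6 amounts to the following. This is a sketch; the NCI file formats and parsing are not reproduced here.

    # For each 0.25 log-unit concentration bin, the plotted value is
    # (compounds showing an inhibitory effect at that concentration) /
    # (compounds actually tested at that concentration).
    import math
    from collections import defaultdict

    tested = defaultdict(int)   # bin -> number of compounds tested
    active = defaultdict(int)   # bin -> number showing an effect

    def conc_bin(conc_molar):
        return round(math.log10(conc_molar) / 0.25) * 0.25

    def record(conc_molar, shows_effect):
        b = conc_bin(conc_molar)
        tested[b] += 1
        if shows_effect:
            active[b] += 1

    def fraction_with_effect():
        return {b: active[b] / tested[b] for b in sorted(tested)}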

Fig. 6 NCI screening data analysis. X axis: concentration. Y axis: fraction of compounds in NCI public datasets on cell-based screens that show inhibitory effect in that assay as a fraction of number of compounds tested at that concentration. Results are binned into concentration bins on a log scale, each bin spanning 0.25 log(concentration) units. Blue diamonds: HIV screening data [39]. Red squares: cell line screening for anti-cancer effect [40]

Fragment-based screening

Fragment-based screening (FBS) seeks to identify small molecules that bind with relatively low affinity to proteins, and then combine these into larger molecules that bind with greater affinity (reviewed in [41–45]). FBS actively looks for high micromolar or low millimolar affinity of small molecules to proteins. It is a commonplace for researchers in this field that many small molecules (“fragments”) bind to most proteins at low millimolar concentrations. For example, [43] comment that “Novice users [of fragment-based screening by BiaCORE] are often surprised to see how often small molecules bind indiscriminately to proteins when compounds are assayed at high concentrations”. The data they give suggests ~75 % of a 1000 compound subset of the Maybridge Ro3 library bound equally well to two different targets at high micromolar to low millimolar concentrations. Congreve et al. [46] find that 90 % of compounds in their library have some binding at mM affinities. Hubbard [41] found a ‘hit’ rate of between 1.5 and 4.7 % when measuring small molecule binding to proteins at 0.5 mM by NMR. Giannetti [47] reviewed 20 different fragment-based screens, and reports that all show ‘non-specific’ binding at affinities of 1 – 4 mM, although the highest affinities found ranged over three orders of magnitude in the different experiments. Spurlino [48] found that between 5 and 50 % of a library bound to target protein crystals at 5 mM (depending on library/target combination).

Other non-specific binding observations

Non-specific interactions are a fact of life for pharmaceutical researchers, even among molecules that are selected for their specificity of action. Even in launched pharmaceuticals, supposedly selected for their singular, specific interaction with one target or target class, multi-target interactions are being recognised as the rule rather than the exception [49, 50]. LaBella commented “The non-specificity of drugs is a generally acknowledged truism” over 20 years ago [50], with genome-scale testing of molecules confirming that nearly all small molecules bind to multiple proteins [51]. Such ‘non-specific effects’ are now being accepted as a critical part of drugs’ actions [52, 53]. Houk et al. review a range of studies of binding of small molecules to proteins and cyclodextrin mimics of protein binding sites, and find an average binding affinity of ~0.5 mM [54].

Molecular mechanism of low affinity binding effects

It is worthwhile touching briefly on potential mechanisms of millimolar binding of compounds to proteins, and the likelihood that this will materially affect the protein’s function. Again, we must challenge the conventional model of a ligand binding to a ‘binding site’ on a protein. Structural studies have shown that many proteins can bind a diversity of chemical structures through adaptation of their structure (reviewed in [55–57]). Many proteins exist in dynamic equilibrium with partially or completely unfolded structures, some being dominantly disordered [58–61]. Post translational modification [62] or ligand binding [63, 64] can switch proteins from a disordered to a more ordered state, switches which can be related to their regulation and function [59, 65]. Proteins can also have multiple ordered, metastable structures (reviewed in [66]), and different folding states can be selected by ligand binding and have significantly different biological function (see for example [67–71]). There can also be multiple folding paths leading to each of those states (see e.g. [72]).

A molecule that binds even weakly to one folding state of a protein and not to another will bias the population of protein folding states by stabilising the bound state over the others (by definition, if a small molecule M binds to folding state A, then the combination of A + M must be more stable than A and M on their own, and hence more likely to occur). If one of the structures in the spectrum of structures has a function absent from other structures, then binding of the small molecule will change that function by changing the amount of the functional conformer. The binding need not be ‘tight’, and may not even be detectable on the canonical crystal structure for the protein, but will nevertheless affect function in the cell.
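A toy calculation makes the size of this effect concrete. It is an illustration of the argument only, not a model of any specific protein, and the symbols (K, Kd, m) are introduced here for the example.

    # If a protein interconverts between states A and B with equilibrium
    # constant K = [B]/[A], and a small molecule M binds only state A with
    # dissociation constant Kd, then at free concentration m the apparent
    # equilibrium becomes K / (1 + m/Kd): state A is stabilised in proportion
    # to how close m is to Kd.
    def apparent_equilibrium(K, m, kd):
        return K / (1.0 + m / kd)

    # A 1 mM ligand with a 1 mM Kd halves the relative population of state B:
    print(apparent_equilibrium(K=1.0, m=1e-3, kd=1e-3))   # 0.5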

In conclusion, it is found in many types of experimental systems that all, or nearly all, small molecules interact with many proteins with low millimolar affinity, and these low affinity bindings can have significant biological effect through modulation of the population of structures adopted by a protein. This observation leads both to an explanation of the mechanism of millimolar toxicity, and to its correlation with UnBiological.

Proposed mechanism of correlation of Ub with millimolar toxicity

Selection against protein binding of metabolites

The observations above, that many molecules interact with many cellular targets at millimolar concentration and that these interactions are likely to have significant biological effects, raise an obvious question. If many molecules can interact with many proteins at low millimolar levels, and such interaction has adverse effects on the cell, and many metabolites are present in the cell at low millimolar concentration, then why does the cell not poison itself with its own metabolites?

A plausible explanation is that the proteins (and other large molecular constituents of the cell) have evolved to avoid interference from the cell’s normal constituents. A protein that needs to interact with (say) an α-amino acid will evolve a binding site for that α-amino acid. A protein that does not require interaction with an α-amino acid for its function may nevertheless have a low affinity binding site for an α-amino acid in one of its conformers by chance. If this low affinity binding site has an adverse effect on the cell, then it will be selected against. In short, deleterious non-specific interactions with the cell’s normal constituents will be selected against, just as beneficial interactions will be positively selected for.

Thus we would expect any binding site or pocket on a protein that could bind an amino acid to be selected against unless that interaction provided a beneficial effect on the function of the protein. Any compound that ‘looked like’ an amino acid (i.e. had similar chemical groups arranged similarly in space) would therefore also not find binding sites on that protein. Similarly there would be selection against random or fortuitous binding sites for the chemical features present in sugars, lipids, phosphate esters and other common structures in metabolism. However there would be no selection against low affinity, random binding to fluorocarbons, organosilicon compounds or other chemicals quite different from anything normally in a cell. These, therefore, would be free to bind to any cellular protein if, by chance, a binding site happened to exist for them. The larger the segment of the xenobiotic that was unlike the chemistry of life, the greater the potential affinity for such a non-canonical binding site. The association of UnBiological with toxicity shown in Table 1 is therefore a consequence of the failure of biochemistry to be selected to avoid random binding of chemicals that the cell does not usually encounter.

Testing the hypothesis with molecular docking

Such a hypothesis has not been tested experimentally as far as I know, except in so far as low affinity binding of small molecules to proteins is commonly observed as noted above, although it has been observed that D-amino acids are mildly toxic to a wide range of microorganisms compared to their L-enantiomers [73]. In principle the low millimolar binding of small ligands to proteins could be tested computationally using molecular docking software, by trying to dock molecules known not to be ligands for a protein to that protein. Large-scale protein docking exercises do show that the majority of small molecules dock to target proteins with low millimolar or high micromolar affinity (for example [74–76]). Unfortunately, the low-affinity predictions of these exercises are unreliable (which is why they are usually ignored). As we do not know where the ‘binding site’ for a non-ligand might be, the test non-ligand must be docked to the whole protein. This provides such a large number of potential interactions that the software cannot reliably discriminate actual likely binding sites from implausible ones. Figure 7a illustrates this, docking 56 drugs to the ABL kinase. There is a strong trend for larger molecules to be predicted to have a higher affinity for ABL (which is plausible), but the largest molecules are predicted to bind as tightly as some bona fide inhibitors, despite having no known inhibitory effect on the enzyme.

Fig. 7 Docking small molecules with entire protein structures. a: Binding of 18 known ABL inhibitors, compared to the binding of 56 drugs or natural products not reported to have any effect on ABL kinase activity. Y axis: Vina output binding energy. X axis: molecular weight. b: Comparison of the predicted binding energy of 15 alpha amino acids and their alpha-N methyl alpha-carboxymethyl derivatives with the binding energy of equivalent beta amino acids and amino acid derivatives to ABL, Aldolase, HIV protease, PDE4b and PPAR gamma structures. Excluded amino acids were: glycine, which has no beta amino acid; beta alanine, which is a metabolite in its own right and so was excluded; beta aspartate and beta asparagine, which are the same as alpha aspartate and alpha asparagine; and beta threonine, which is likely to be unstable and so not a realistic chemical structure. Error bars are 95 % confidence limits (1.98 * standard error of the mean)

If we confine ourselves to comparing molecules of the same size and atomic constitution, then some of the artefactual results shown in Fig. 7a might be avoided. Figure 7b compares the predicted binding energies of α-amino acids and β-amino acids to five mammalian proteins, selected to represent a mix of functional classes of proteins for which multiple structures and many authentic ligands were known. α-amino acids are core metabolites in mammals; β-amino acids, with a couple of exceptions, are not part of normal mammalian metabolism. For some but not all proteins tested, α-amino acids are predicted to bind with lower affinity than β-amino acids, consistent with the hypothesis. The exceptions found here are HIV protease (which would be expected to bind amino acids, as they are related to its substrate) and PPAR-γ (for which I have no explanation). Repeating this exercise with more sophisticated models that took the dynamics of proteins as well as ligands into account (e.g. [77, 78]) might produce more useful results.

Conflict with pharmaceutical experience

The suggestion that molecules that are not like biological molecules are more likely to be toxic appears paradoxical to the pharmaceutical chemist, as many drugs have potent (and hence potentially toxic) effects precisely because they are close molecular mimics of known metabolites. Thus, steroid drugs are potent precisely because they mimic natural steroids, dideoxyribonucleotides block viral DNA synthesis because of their mimicry of normal nucleosides [79, 80], penicillins mimic peptidoglycan components [81], and so on. However these molecules have been selected by evolution or by chemists to both mimic a specific biological effector and not to have any other effects than their target pharmacology. It is a truism of drug discovery that achieving this combination is extremely hard, and that unexpected or ‘off-target’ effects are a common cause of failure in drug discovery and development programmes [82–84]. Some of these effects are due to the close structural similarity between members of families of proteins, so that a drug selected to bind with high affinity to one target will be likely to bind to another, structurally similar target. However other ‘off-target’ effects are not obviously related to the known structural similarity of the ‘off-target’ proteins [85]. Yamanishi et al. [86] suggest that this is because small regions (equivalent to the Fragments used in my analysis) confer protein binding. A substantial fraction of the effort in drug discovery programmes is tailoring the specificity of the candidate drug to bind to a small number of targets, and many launched drugs actually bind to more than one protein family [87–90]. Drugs are therefore a special case, the result of extensive selection by man to fit with biology. The same explanation is true for the observation that chemicals that are not metabolites but fall within ‘Biochemical Space’ have a higher chance of being toxic even in the absence of selected pharmacology [4].

Detoxification and resistance

A second apparent conflict with pharmaceutical experience is that organisms can and do tolerate a wide range of toxic compounds, through tolerance, detoxification, and resistance mechanisms. The first two of these are less important to my general thesis than they might appear, and the third actually supports it.

Tolerance to a toxin or drug is almost invariably caused by changes in the organism’s physiology to compensate for the action of the drug or toxin. This is classically true of pharmacological agents such as alcohol, nicotine or heroin, but also of classic toxins such as arsenic. Mechanisms that oppose the effect of the drug or toxin are induced to restore a more normal physiological state. This is unrelated to the mechanism of intoxication in the first place.

Detoxification is a broad approach to removing toxins from an organism. It usually relies on enzymes (such as the CYPs in mammalian liver [91]) or transporters (such as the PGP family [92]) with very broad substrate specificities. It can also involve physical separation of the toxin into a defensive cell compartment. Compartmentalization is a common strategy for cells to sequester damaging metabolic chemistry from cell components that that chemistry might damage (e.g. oxidative phosphorylation in eukaryotes, anammox in prokaryotes). Sequestering misfolded proteins, damaged cell components or toxins can be seen as a form of ‘internal exile’, analogous to the export of these materials.

Acquisition of resistance can occur through one of two broad mechanisms. Detoxification mechanisms can be increased, often by mutation that increases expression of the relevant protein – this is a common mechanism of acquisition of drug resistance in cancer cells and in bacteria [92, 93]. The other mechanism is for targets of the toxin to mutate so that they no longer bind the toxin. This is common for drug resistance [94–96]. It is not expected to apply to low potency, low molecular weight toxins, because (following my argument above) many, possibly most proteins would have to be mutated to evade toxicity. The mechanism of organisms’ resistance to chemicals other than drugs has not been reviewed systematically, so we do not know if this prediction is true.

Conclusion

I have shown above that molecules that contain segments that are not similar to common components of metabolism are more likely to show toxicity at millimolar levels than compounds that have chemistry similar to life. I relate this to the widespread observation that many chemicals bind to many proteins at low millimolar levels, and that this can materially affect the function of those proteins.

This observation provides an explanation for why the chemistry of life occupies a small corner of chemical space. In order to function, the components of the cell must interact with each other appropriately, both with functional interactions between the macro-molecules and metabolites of the cell and with the absence of unwanted interactions. Each new chemical added to metabolism requires adaptation of the whole proteome to accommodate the new chemical. Once a complex, self-perpetuating metabolism has evolved, adding to it will be an increasingly demanding evolutionary task, not an impossible one but one that the pragmatic mechanisms of evolution will tend to avoid.

This finding has two implications. Firstly, UnBiological could be used as a measure of the chance that a new molecule is toxic. Such broad toxicity predictions are less useful than predictions of specific mechanisms of toxicity, and UnBiological specifically does not provide a mechanistic explanation. It is also only a statistical estimate. From the data analysed here, Ub5 or Ub6 could be used to give an order-of-magnitude estimate of the potency of a low-potency toxin, but would say nothing about high potency toxicity. It is possible that coupling Ub with other measures [4] might give more accurate estimates. As an initial screen for ‘drug-like’ properties [97–100], however, such a statistical indicator could find a use.

In this application of predicting toxicity, a strong limitation of the analysis presented here is that it takes no account of the concentration of metabolites in the cell. Metabolic intermediates present at nanomolar concentration are given the same weight in the Ub calculations above as common components such as glucose or glycine. One would however expect the selective pressure on proteins to avoid binding glucose to be much stronger than the pressure to avoid binding metabolites present at nanomolar concentrations.

An extension of this work would therefore include a concentration term in the calculation of UnBiological. This would include two components – consideration of the differing metabolomes of different cells or organisms, and quantitative consideration of the concentration of metabolites in an organism. In this study, a single collection of metabolites (“core metabolism”) was used to define UnBiological. I expect that predictions of toxicity based on the actual intracellular metabolome of a specific species would be more accurate for that species (and less accurate for other species) than this generic approach. This is however a substantial undertaking, involving re-calculation of most of the comparisons presented here for each species, and so has not been attempted in this paper: my goal here is to show that this approach is theoretically and practically interesting. It might also be valuable to weight the contributions of metabolites to the Ub calculation according to their intracellular concentration, although this is fraught with difficulty as intracellular concentrations of metabolites are very hard to measure, and in any case are modulated by the protein binding that this study postulates occurs promiscuously and universally. Future work could also explore the size of the overlap necessary to define Ub: again, this would be doable but time-consuming, and so has not been attempted here.

The second implication of this work is in the field of metabolic engineering and synthetic biology. Engineering an organism to produce a new chemical or execute a new metabolic pathway has been thought to require the expression of suitable enzymes to make the chemical and any intermediate or precursor molecules at sufficient concentration and efficiency, and from suitable feedstock. The rest of the cellular machinery is generally viewed as a ‘chassis’ on which to attach these changes [101–103]. For chemicals or gene products produced at low concentrations this is likely to be true [104]. However if the goal of the engineering is to produce a chemical at substantial levels [105], then the analysis in this paper suggests that many aspects of the cell must be engineered, especially if the chemical to be produced is very different from one usually present in the cell.

Methods

Toxicity data

Databases of molecular structures and biological endpoints were collected from literature sources, as listed in Table 3. Data was filtered to collect toxicity endpoints that were, as far as practical, the same endpoint for different studies on the same organism. Data sets were collected that

  • provided quantitative half-effect concentration estimates (i.e. not single-concentration toxicities)

  • provided data on at least 50 compounds from diverse chemical families

  • were available from a small number of sources (for practical reasons, data sets of 200 compounds studied in 200 papers were not used)

  • were from species with recognised use in toxicity testing.

All EC50 values were as reported in the relevant papers or databases except those for Saccharomyces cerevisiae, where EC50 values were calculated from the raw inhibition data downloaded from [106]. The Saccharomyces data set was filtered to exclude organometallic compounds; mixtures or salts other than halogen or alkali metal salts; compounds for which growth inhibition at the highest concentration was <50 % or for which the growth inhibition at the lowest concentration was >50 %; and compounds for which the range of calculated EC50s across the 13 strains tested in this data set (calculated as [maximum(EC50) − minimum(EC50)]/average(EC50)) was >1. The resulting data set represented well defined organic compounds with EC50s within the experimentally measured concentrations and consistent toxicity across a range of Saccharomyces cerevisiae strains.
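The consistency filter on the per-strain EC50s can be written directly from its definition; the sketch below shows only that step, not the parsing of the raw inhibition data.

    # Keep a compound only if its EC50s across the 13 strains agree to within
    # (max - min) / mean <= 1.
    def ec50_range_ok(ec50s_across_strains):
        mx, mn = max(ec50s_across_strains), min(ec50s_across_strains)
        mean = sum(ec50s_across_strains) / len(ec50s_across_strains)
        return (mx - mn) / mean <= 1.0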

Chemical structures were collected as .MOL files and compiled into an SDF file for processing.

Molecules of metabolism

The chemical space of metabolism was taken as all the molecules shown on the printed version of Part A (intermediary metabolism) of the Roche/Expasy metabolic chart, with the exception of the steroid hormones. Steroid hormones were omitted because they represent many elaborations on the same core (sterol) structure, and do not add significantly to the diversity of chemical types. The chemical space of 611 metabolites is referred to as ‘core metabolism’ in this paper.

Generation of ‘all’ molecules in chemical space

The space of all possible chemical structures was explored by the program Combimol [5]. In brief, the program generates chemical structures based on SMILES strings [107]. The program aims to generate all the molecules of a specified size (defined as number of non-hydrogen atoms), using a subset of C, N, O, S, P or Si that is specified by the user. ‘All molecules’ are here defined as all 2-D chemical structures that have a bonding pattern consistent with the valencies of the atoms used: four bonds for C and Si, 3 for N, 2 for O, 3 or 5 for P, 2, 4 or 6 for S. For the work described in this paper, silicon was excluded from this list, and only P(V) and S(II) and S(VI) (sulphate) were used, consistent with those elements’ use in metabolism.

The SMILES language is a simple, text-based method for coding chemical structures [107]. The program starts by generating an exhaustive list of all possible linear atom strings up to the desired size: if the maximum size was 4, it would generate CC, CCC, CCCC, CCCN, CCCO, CCCP, CCNC and so on. It then replaces single bonds with double bonds in any position allowed by the elements (CCC=C, CC=CC etc.). It then generates cyclised versions of these strings (C1CCC1, where the two ‘1’ symbols represent atoms that are connected – the reader is directed to [107] for a further description of the SMILES language). A molecule can have up to two ring systems, including fused rings. These form unbranched ‘core’ molecules. In a final step, the ‘core’ molecules are joined to each other to form branched molecules: thus CCC and CC could be joined to form CC(C)C.

A number of rules are included in this process to remove atom combinations that would be unstable (such as COOOC).
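To give a flavour of the enumeration, the sketch below is a deliberately cut-down toy (linear strings only, with a single illustrative stability rule); it is not the Combimol program itself, and the atom set and rules shown are placeholders.

    from itertools import product

    ATOMS = "CNOS"               # P, double bonds, rings and branching omitted here
    UNSTABLE = ("OOO", "NNN")    # illustrative stability rules only

    def linear_cores(max_atoms):
        # Enumerate all linear heavy-atom strings up to max_atoms and drop
        # strings containing an unstable run (cf. the COOOC example above).
        for n in range(2, max_atoms + 1):
            for combo in product(ATOMS, repeat=n):
                smiles = "".join(combo)
                if not any(bad in smiles for bad in UNSTABLE):
                    yield smiles

    print(sum(1 for _ in linear_cores(4)))   # count of surviving linear strings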

As described in [5], the program is not completely exhaustive (it under-represents quaternary carbon centres, for example); however, I estimate that it generates over 90 % of the molecular structures that are consistent with the rule set provided. The program, and a newer, more systematic version (which was not used in this work but will be for future work), are available for non-commercial applications from the author.

Molecular matching and similarity

I define here the terms for molecular structures and matching used in this paper. These are not significantly different from how these terms are used in the general chemical literature, but are laid out explicitly here to avoid confusion.

  • A molecule is a set of atoms connected by bonds, in which all the atoms’ valencies are filled. It is assumed that all valencies that are not explicitly linked to another atom in the description of the molecule are filled with hydrogen atoms. (“molecule” here is completely consistent with the common understanding of the term, and so will not be capitalised: I define it here solely for completeness).

  • A Fragment is a set of atoms connected by bonds, in which the valencies of at least one of the atoms are not filled. A Fragment therefore represents part of the structure of a molecule, and not a real physical entity.

  • Substructure. Molecule or Fragment A is said to be a substructure of molecule or Fragment B when all of the non-hydrogen atoms and all the bonds of molecule or Fragment A can be overlaid on molecule or Fragment B in at least one position.

  • Overlap. Molecule or Fragment A and molecule or Fragment B are said to have an N-atom overlap when the largest Fragment which is a substructure of molecule A and of molecule B has N atoms.

To identify matches and Overlaps between molecules and Fragments, I used a 2D fragment-based molecular descriptor system that has previously been described [7, 8], and proven effective in building models to predict toxicity outcomes. In summary, I generate an exhaustive set of Fragments from all the molecules used in this study as follows. For each pair of molecules, the maximum common structure (Maximum Common Subgraph – MCS) is found by ‘overlapping’ the 2D molecular structure. The MCS between each of these Fragments and between each Fragment and each original molecule is then computed to generate further Fragments. This is repeated until no new Fragments are found. The result is a list of all the molecular Fragments that are present in two or more of the molecules in the set. Fragments of 1 or 2 atoms are ignored.
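The pairwise MCS step can be approximated with open-source tools; the sketch below uses RDKit (mentioned under ‘Availability of data and materials’) rather than the original Amedis software, and is illustrative only.

    from rdkit import Chem
    from rdkit.Chem import rdFMCS

    def pairwise_mcs_fragments(smiles_list):
        # Return the maximum common substructures (as SMARTS) of every pair of
        # molecules; Fragments of 1 or 2 atoms are ignored, as in the text.
        mols = [Chem.MolFromSmiles(s) for s in smiles_list]
        fragments = set()
        for i in range(len(mols)):
            for j in range(i + 1, len(mols)):
                result = rdFMCS.FindMCS([mols[i], mols[j]])
                if result.numAtoms >= 3:
                    fragments.add(result.smartsString)
        return fragments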

Molecular descriptors of a molecule are then computed by matching a set of these molecular Fragments to that molecule, and counting the number of distinct ways that a Fragment can be exactly mapped onto a molecule. A descriptor is an integral count of the number of occurrences of a Fragment in a molecule. The molecule as a whole is described by the pattern of Fragment descriptors.
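In the same spirit, the descriptor for one Fragment is simply a match count. This is again an RDKit-based sketch, not the original implementation; different matchers may count symmetric Fragments slightly differently.

    from rdkit import Chem

    def fragment_descriptors(mol_smiles, fragment_smarts_list):
        # One integer per Fragment: the number of distinct places the Fragment
        # matches the molecule.
        mol = Chem.MolFromSmiles(mol_smiles)
        counts = []
        for smarts in fragment_smarts_list:
            pattern = Chem.MolFromSmarts(smarts)
            counts.append(len(mol.GetSubstructMatches(pattern)))
        return counts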

Molecular matching and molecular Fragment generation were performed by software built by Amedis Pharmaceuticals Ltd. (see [7, 8] for details) and kindly provided by Dr. Antranig Basman. Conversion of data files for transfer between programs was done with a number of small programs written specifically for this project in Qbasic, and compiled with the QB64 compiler [108]. All programs other than MolDescrip are available from the author on request, and source code for programs other than those originating from Amedis Pharmaceuticals is also available. General chemical database manipulation was done using the CambridgeSoft ChemBioOffice suite version 12.0, under site licence to MIT. All work was done on standard PCs running various versions of Windows depending on their age.

Docking

Docking was done using AutoDock Vina [109]. The ‘binding site’ was defined as the entire surface of the protein for all the proteins. Potential ligands were docked to a number of structures for each protein, as listed in Table 4. Ligands were docked using default parameters except for ‘Exhaustiveness’, which was set to 100. The binding energy of a ligand to a protein was taken as the most negative (i.e. strongest) predicted binding energy of any ligand conformation at any site on any of the tested protein structures.
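A whole-surface (‘blind’) docking run of this kind amounts to making the search box large enough to enclose the entire protein. The sketch below shows one way to drive AutoDock Vina from Python; the file names, box centre and box dimensions are placeholders, and the receptor and ligand are assumed to have been prepared as PDBQT files beforehand.

    import subprocess

    def blind_dock(receptor_pdbqt, ligand_pdbqt, center, size, out_pdbqt):
        # Box centre and size (Angstroms) are chosen to cover the whole protein;
        # exhaustiveness is set to 100 as in the text.
        cmd = [
            "vina",
            "--receptor", receptor_pdbqt, "--ligand", ligand_pdbqt,
            "--center_x", str(center[0]), "--center_y", str(center[1]),
            "--center_z", str(center[2]),
            "--size_x", str(size[0]), "--size_y", str(size[1]),
            "--size_z", str(size[2]),
            "--exhaustiveness", "100",
            "--out", out_pdbqt,
        ]
        subprocess.run(cmd, check=True)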

Calculation of ‘UnBiological’ (Ub)

‘Unbiological’ is a measure of the size of a region of a molecule that is not represented in metabolism. In this paper ‘metabolism’ is taken to be the set of ‘core metabolism’ molecules defined by the 611 chemicals listed in the Roche/Expasy metabolic map, as described above (section Molecules of metabolism).

What is meant by ‘not represented’ depends on the size of the Fragment that is being considered. If we only require one atom similarity between a metabolite and a test molecule, then clearly almost all molecules can ‘match’ a core metabolite. Thus Ub depends on our definition of similarity.

In this study, I define Ub as follows: Fig. 3 illustrates this process. The metabolites of ‘core metabolism’ are broken into N-atom fragments. A region is ‘unbiological’ if it does not contain an exact match to any of the fragments generated from core metabolism. Thus in Fig. 3, three metabolites (top left) and three test chemicals (top right) generate 12 Fragments, of which 9 (bottom left) completely match the original metabolites. The three Fragments not found in metabolites (bottom right) are UnBiological. These are matched to the target molecules (bottom right). The size of the largest UnBiological fragment that can be matched to a test molecule is its Ub value.
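Put in code, the definition reads as follows. This is a sketch using RDKit substructure matching in place of the original matcher, and it assumes the Fragment list has been generated as described under ‘Molecular matching and similarity’ above.

    from rdkit import Chem

    def ub_value(test_smiles, fragment_smarts_list, metabolite_smiles_list):
        # Ub = number of atoms in the largest Fragment that matches the test
        # molecule but matches none of the core metabolites.
        metabolites = [Chem.MolFromSmiles(s) for s in metabolite_smiles_list]
        test = Chem.MolFromSmiles(test_smiles)
        ub = 0
        for smarts in fragment_smarts_list:
            pattern = Chem.MolFromSmarts(smarts)
            in_metabolism = any(m.HasSubstructMatch(pattern) for m in metabolites)
            if not in_metabolism and test.HasSubstructMatch(pattern):
                ub = max(ub, pattern.GetNumAtoms())
        return ub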

This is a measure of the size of a sub-region of a molecule that has an arrangement of atoms unlike an arrangement found in biology.

As noted in the text, this depends on the original set of biochemicals used as a definition of ‘biochemistry’. The use of central metabolic pathways is convenient, but could be improved.

Ethics and consent

This work involved no human or animal experimentation, and so no ethical or other consent was relevant.

Availability of data and materials

The complete data set of Ub values and toxicity endpoints for the chemicals analysed in this study is available for download from LabArchives (https://mynotebook.labarchives.com/), at DOI 10.6070/H4VQ30PJ (direct URL for download of the spreadsheet https://mynotebook.labarchives.com/share_attachment/Bains_Data/MTkuNXwxNTAzOTUvMTUtMy9UcmVlTm9kZS83NjQwNTkxNDl8NDkuNQ==). A ZIP file of the MOL files for the chemicals used in this study can be downloaded from https://mynotebook.labarchives.com/share_attachment/Bains_Data/MTkuNXwxNTAzOTUvMTUtNC9UcmVlTm9kZS8yMzAwNTAxMjU0fDQ5LjU=. The programs used for this specific analysis were proprietary to Amedis Pharmaceuticals Ltd (Cambridge, UK), when that company existed, and are available as compiled code only: consequently they are not available for general use. However similar chemical matching functionality can be found in RDKit (http://www.rdkit.org/docs/index.html), which is an open-source, Python-based platform.