Introduction

Cocrystals are important products of materials science, and many branches of industry take advantage [1] of the possibility of tuning the properties of solids [2]. These include, among others, the pharmaceutical [3, 4], agrochemical [5, 6] and high-energy materials industries [6–8]. Not all multicomponent solids are classified as cocrystals [9], since at least two criteria must be met [9–13]. First, the homogeneous phase formed upon cocrystallization should comprise the components in stoichiometric proportions. Second, all coformers should be solids under ambient conditions. The possibility of altering physicochemical properties through successful cocrystallization is especially welcome in the case of active pharmaceutical ingredients (APIs). There are many examples of significant improvements in API behavior both in vivo and in vitro [14, 15] due to the enhancement of pharmacokinetic properties such as solubility [4, 16, 17] and bioavailability [18–20]. Many other physicochemical properties can also be modulated by cocrystallization, including stability [21–24], hygroscopicity [25] and shelf life [26]. Among many drugs, aromatic amides acting either as APIs or as excipients currently attract substantial attention [27–35]. These compounds are known for their important roles in medical applications. For example, nicotinamide, also known as vitamin B3 or vitamin PP, is a component of the coenzyme NAD [36]. Pyrazinamide, with its bacteriostatic and bactericidal activities, is an efficient antitubercular agent [37] and is also recognized as an important medication. Salicylamide and ethenzamide are known analgesic and antipyretic drugs [38]. They are used as non-prescription painkillers belonging to the nonsteroidal anti-inflammatory agents, with medicinal uses similar to those of aspirin. Temozolomide, marketed under brand names such as Temodar, Temodal or Temcad, is an orally administered alkylating agent used in chemotherapy for the treatment of some types of brain cancer and as a first-line treatment for glioblastoma multiforme [39, 40].

The majority of aromatic and heteroaromatic amides are poorly soluble in water, and cocrystallization with more soluble coformers might be one remedy for this limitation [4]. Although many pharmaceutical cocrystals containing amides have been studied [27–35], the data deposited in the Cambridge Structural Database (CSD) [41] are rather heterogeneous in the sense that different coformers were used for different amides. For example, cocrystals of fumaric acid with benzamide, isonicotinamide and nicotinamide are known under the refcodes YOPBUB, LUNNOX and EDAPOQ, respectively, but there is no information about solids of fumaric acid with temozolomide. There is information about the cocrystallization of 4-hydroxybenzoic acid and 4-nitrobenzoic acid with isonicotinamide, but a cocrystal of nicotinamide is known only with the former compound. There are many such “gaps,” which can be identified by retrieving the corresponding data from the latest edition of the CSD. Of course, the absence of a structure from the CSD does not necessarily indicate that a specific system has not been studied. There are also cocrystal screening studies that do not report structures but provide information about positive or negative outcomes. However, experimental verification of all possible pairs of potential coformers is impractical. That is why theoretical cocrystal screening might offer valuable guidance for exploring the cocrystal landscape. Many theoretical approaches have been developed for this purpose, and among them those relying on relatively inexpensive scans of cocrystallization propensities seem worth considering. For example, characteristics of the electrostatic potential surfaces of the interacting molecules can be used to identify the most likely contacts [42]. A virtual cocrystal screening method [43] was successfully validated against experimental cocrystals. In addition, the mixing enthalpy of supercooled liquid coformers at a given stoichiometry can be used for screening cocrystallization potential [44–46]. Furthermore, supramolecular phenomena expressed in terms of homo- or heterosynthons have proved to be valuable guidance for practical applications [47–49]. Alternatively, semiquantitative models for predicting cocrystallization probability have been formulated [50, 51] in terms of statistical analysis of molecular descriptor distributions and chemometric analysis.

The aim of this paper is to explore the similarities between the cocrystallization landscapes of different compounds, assuming that cocrystallization propensities can be modeled by the miscibility of the components in liquids under supercooled conditions. As a consequence, a significant extension of the list of intermolecular complexes in the solid state is expected, based on existing knowledge of experimentally verified cases. Systems that do not cocrystallize can potentially be identified as well. This kind of virtual screening, relying on the cocrystallization properties of one compound, can potentially offer predictions of cocrystals for other similar systems. Identifying such cases and formulating suitable rules is the main goal of this paper.

Computation method

Training sets of cocrystals and coformers

The CSD [41] (release 2016) was searched for binary systems comprising aromatic or heteroaromatic amides. The list of coformers was built based on the composition of cocrystals, understood according to the most common definition [9–13]. Hence, systems comprising components that are liquid under ambient conditions were excluded from the analysis, as were ions, polymers, solvates (including hydrates) and clathrates. Among all 8543 entries comprising two distinct chemical units, 45 amides were found to be involved in 356 cocrystals with 211 distinct coformers. These 211 compounds defined the first set of coformers. Additionally, a second set was considered, comprising compounds appearing on the EAFUS (Everything Added to Food in the United States) or GRAS (generally recognized as safe) lists. This second set, comprising 677 species that are neutral and solid under ambient conditions, might be of practical importance since many amides are used in medical formulations, as already mentioned. The chemical names of all aromatic and heteroaromatic amides involved in binary cocrystals are collected in Table 1.

Table 1 List of aromatic and heteroaromatic amides involved in binary cocrystals

Mixing enthalpy estimation

Cocrystal screening was performed according to a methodology relying on thermodynamic computations of coformer mixing in the liquid state under supercooled conditions. This approach has been successfully applied to a variety of systems, including cocrystals [44–46]. It assumes that the miscibility of supercooled liquids is associated with miscibility in the solid state. To quantify the affinity of coformers, the excess thermodynamic functions were computed for a mixture of two components in given stoichiometric proportions. In particular, the mixing enthalpy can be defined as follows:

$$\Delta H_{12}^{\text{mix}} = H_{12} - \left( {x_{1} H_{1}^{1} + x_{2} H_{2}^{2} } \right)$$
(1)

where subscripts denote solutes, superscripts represent solvent types and $x$ stands for the mole fraction of a given component. The enthalpy of cocrystal formation, $H_{12}$, can be estimated from computations in the binary liquid as follows:

$$H_{12} = x_{1} H_{12}^{1} + x_{2} H_{12}^{2}$$
(2)

The excess enthalpy accounts for all energetic contributions, including hydrogen bonding and van der Waals interactions, of all energetically favorable conformers of each component. The computations were performed using the COSMOtherm software [52] at the semiempirical level provided by the BP_SVP_AM1_C30_1501.ctd parameter file. This semiempirical approach is fairly reliable and offers a balance between computational cost and the accuracy of the thermodynamic characteristics. The number of structures considered here precludes the use of more sophisticated levels of theory, especially since it is vital to include several thermodynamically favorable conformers of each compound for adequate sampling of the intermolecular interactions in the modeled liquids. The geometries of all amides and coformers were optimized using MOPAC2012 [53], both in the gas phase and in the condensed phase modeled with the aid of the conductor-like screening model for real solvents (COSMO-RS) [54, 55]. All systems were mixed in equimolar proportions. It is worth mentioning that the highest probability of cocrystallization is assumed for cases with sufficiently negative $H^{\text{mix}}$ values [45, 52]. Here, this threshold was set to −1.30 kcal/mol.
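As a minimal sketch of how Eqs. (1) and (2) combine with the −1.30 kcal/mol threshold, the following Python snippet may be helpful. The function names and all numerical inputs are illustrative assumptions and are not taken from this work; in practice the enthalpy terms would come from the COSMOtherm output.

```python
THRESHOLD = -1.30  # kcal/mol, cutoff adopted in the text


def mixture_enthalpy(x1, h12_1, h12_2):
    """Eq. (2): H_12 = x1*H_12^1 + x2*H_12^2, with x2 = 1 - x1."""
    return x1 * h12_1 + (1.0 - x1) * h12_2


def mixing_enthalpy(x1, h12, h1_pure, h2_pure):
    """Eq. (1): H_mix = H_12 - (x1*H_1^1 + x2*H_2^2)."""
    return h12 - (x1 * h1_pure + (1.0 - x1) * h2_pure)


def likely_cocrystal(h_mix, threshold=THRESHOLD):
    """True when the mixing enthalpy is more negative than the threshold."""
    return h_mix < threshold


# Hypothetical enthalpy terms (kcal/mol) for an equimolar mixture
x1 = 0.5
h12 = mixture_enthalpy(x1, h12_1=-12.0, h12_2=-15.0)
h_mix = mixing_enthalpy(x1, h12, h1_pure=-10.0, h2_pure=-13.0)
print(round(h_mix, 2), likely_cocrystal(h_mix))
```

Keeping the threshold as a named constant makes it easy to test how sensitive the classification is to the chosen cutoff.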

Results and discussion

The cocrystallization propensities of aromatic and heteroaromatic amides are quite well recognized, as documented by numerous records deposited in the CSD. However, there are no studies that systematically characterize the whole group of these compounds against a particular set of coformers. This paper intends to fill this gap through a methodical and extended comparison of cocrystallization landscapes, with the aim of inferring practical rules that enable screening by analogy. In the first part, such criteria are formulated and validated. Then, the observed patterns are used to significantly extend the list of probable cocrystals formed by the studied class of compounds, by enumerating several examples that have not been verified experimentally but are very plausible.

Similarities of mixing properties under supercooled conditions

It is to be expected that structural similarities between two compounds also translate into similarities in their intermolecular interactions. This includes the potential for forming intermolecular complexes, which is a necessary condition for cocrystallization. One can, however, raise the question of how to quantify similarities between distinct molecules. Since the mixing enthalpy is often used as a first indication of the potential stability of molecular complexes [45, 52], its value seems to be a natural index of the similarity of compounds' affinities in the context of the homogeneity of condensed phases. To verify this hypothesis, a series of pairs involving amides and other coformers was considered, for which $H^{\text{mix}}$ values were computed and used as a quantitative measure of likeness between different compounds. The main idea behind such computations is to find a representative compound that accounts for the properties of another one. Interestingly, this seems to be feasible, as can be directly inferred from the linear trends presented in Fig. 1. For clarity, only the most frequently occurring amides are shown in this plot. Indeed, the selected seven amides are involved in 258 of the 356 cocrystals found in the CSD. The legend of Fig. 1 provides the correlations between the $H^{\text{mix}}$ distributions of each of the selected amides and that of isonicotinamide, which was set as the reference molecule. Choosing this particular compound is justified by the fact that it is the most frequently occurring cocrystal former among all the considered aromatic and heteroaromatic amides. Fig. 1 presents the distributions of excess enthalpies for pairs resulting from all combinations of amides and coformers belonging to the first set. This of course includes many systems that have not necessarily been studied experimentally. In fact, this figure offers a quite extended screening of the cocrystallization propensities of the studied amides with all potential coformers belonging to the first set. The $H^{\text{mix}}$ values characterizing existing cocrystals are marked in red. These points represent pairs for which experimental data are available for both isonicotinamide and the given amide, which is not so common.

Fig. 1 Correlation of the excess enthalpy distributions of (2) nicotinamide, (3) pyrazinamide, (4) temozolomide, (5) benzamide, (6) picolinamide, (7) 4-hydroxybenzamide and (8) 2-ethoxybenzamide as a function of $H^{\text{mix}}$ of (1) isonicotinamide. Marked points (in red) characterize real cocrystals found in the CSD. Brackets in the legend contain the values of Spearman’s rank (σ) quantifying correlations between distributions (Color figure online)

The highly linear relationships identified in Fig. 1 suggest that the intermolecular interactions of one amide with the considered set of coformers can be a source of information about the affinities of another amide toward the same set of probing molecules. The slopes of these trends can be used for a general quantification of the components' affinities. For example, the slope of the regression line for temozolomide is equal to 1.21, while for benzamide it is much lower and equals 0.84. This suggests that the intermolecular interactions of temozolomide with the considered set of coformers are stronger than those of isonicotinamide, whereas for benzamide the opposite holds. This implies that if isonicotinamide can cocrystallize with a given coformer, then it is very plausible that temozolomide will have the same ability. However, inferring by analogy the possibility of benzamide cocrystallization from its similarity to isonicotinamide is not so straightforward. Since less negative excess enthalpies are expected for systems comprising benzamide, the values of $H^{\text{mix}}$ should be checked against the threshold. In general, in such situations screening by analogy should be applied in the reverse direction: inferring the cocrystallization of isonicotinamide from the trends of benzamide seems more reasonable.
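The role of the slope can be illustrated with a short sketch in Python. It fits an ordinary least-squares line to the mixing enthalpies of a target amide against those of the reference, and then checks a predicted target value against the threshold. The data are hypothetical and chosen only to mimic a target with a slope below unity; the function name `fit_line` is likewise an assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical mixing enthalpies (kcal/mol) with a common set of coformers
h_mix_reference = [-3.0, -2.0, -1.0, 0.0]
h_mix_target = [-2.4, -1.6, -0.8, 0.0]

slope, intercept = fit_line(h_mix_reference, h_mix_target)

# A coformer for which the reference passes the -1.30 kcal/mol threshold
h_ref = -1.5
h_target_predicted = slope * h_ref + intercept
print(round(slope, 2), round(h_target_predicted, 2), h_target_predicted < -1.30)
```

With a slope below unity, a coformer that passes the threshold for the reference may fail it for the target, which is why the text recommends checking the target value explicitly or reversing the direction of the inference.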

Additionally, it is worth emphasizing some interesting properties of 4-hydroxybenzamide observed in Fig. 1, for which two distinct patterns are clearly visible. Separating these cases leads to the conclusion that one set is characterized by a very steep slope, exceeding 1.8, while the other trend is much less inclined, with a slope close to 0.5. This can be explained by the different nature of the two substituents of 4-hydroxybenzamide. The interactions of the hydroxyl group can be associated with the higher slope, which suggests that these interactions bind coformers much more strongly than those of the amide group. Furthermore, the substituent effect of the hydroxyl group on amide interactions can also be observed. Comparison of the $H^{\text{mix}}$ distribution of 4-hydroxybenzamide with that of unsubstituted benzamide leads to the conclusion that the presence of the OH substituent is associated with a significant reduction in the affinity of the amide group. These incongruent trends are the reason for excluding 4-hydroxybenzamide from the procedure of screening by analogy.

Alternatively, the analysis of similarities between the affinities of the considered amides toward the same set of coformers can be complemented by inspecting the distributions of $H^{\text{mix}}$ values. Fig. 2 provides smoothed histograms clearly demonstrating that very similar cocrystallization propensities can be expected for many amides. All distributions except the one characterizing 4-hydroxybenzamide overlap substantially over the whole presented range, including the most important region of $H^{\text{mix}}$ < −1.30 kcal/mol. Since the histograms presented in Fig. 2 clearly show that the analyzed distributions are not normal, quantifying the trends shown in Fig. 1 requires a nonparametric correlation coefficient. Hence, Spearman’s rank (σ) was used as a measure of correlation instead of Pearson’s correlation coefficient ($R^{2}$), since it relies on the ranking of values rather than on means or standard deviations. These two ways of analyzing $H^{\text{mix}}$ distributions, provided in Figs. 1 and 2, consistently confirm that there are strong similarities in the affinities of several amides interacting with a common set of coformers. Based on this observation, the main claim of this work can be formulated: due to the observed similarities, transferability of cocrystallization propensities between different compounds can be expected. The proper selection of a reference compound, such as isonicotinamide here, can help in rationalizing the choice of coformers for experimental cocrystal screening. In particular, when the affinities between coformers are high enough, the cocrystallization of one amide with a given coformer can be a sign of similar behavior of another one.
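To make the difference between the rank-based and the Pearson measure concrete, the sketch below computes Spearman's rank correlation as the Pearson correlation of average ranks (ties share their mean rank). This is a generic textbook construction, not necessarily the exact procedure used to produce the values in Fig. 1, and the data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical, monotonic but non-linear relation between two H_mix series
reference = [-3.0, -2.0, -1.0, 0.0]
target = [-9.0, -2.5, -1.2, -0.1]
print(round(spearman(reference, target), 3), round(pearson(reference, target), 3))
```

For skewed distributions such as those in Fig. 2, the rank-based coefficient is insensitive to a few extreme values, whereas the Pearson coefficient is not.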

Fig. 2 Smoothed histograms documenting the distributions of mixing enthalpies of (1) isonicotinamide, (2) nicotinamide, (3) pyrazinamide, (4) temozolomide, (5) benzamide, (6) picolinamide and (7) 4-hydroxybenzamide with every compound belonging to the first set of 211 coformers

Taking this into account, a comprehensive selection of binary systems was prepared, part of which is presented in Table 2. A much more extended list can be found in the supplementary materials in Table S1. These tables compile several examples of known cocrystals, augmented with systems marked as potentially positive because they fulfill the requirement of a high probability of cocrystallization ($H^{\text{mix}}$ < −1.30 kcal/mol). For example, succinic acid cocrystallizes with all seven amides included in Table 1, and this is well supported by the predicted $H^{\text{mix}}$ values, all of which are within the range of high cocrystallization probability. Fumaric acid was successfully cocrystallized with six of the considered amides, but not with temozolomide. The quite high affinity between these two compounds, suggested by the $H^{\text{mix}}$ value, makes it very plausible that they will also form a molecular complex in the solid state. This suggestion is additionally supported by the slope of the regression line in Fig. 1 being greater than one. This directly supports the rationale of the screening-by-analogy hypothesis. Many more similar cases can be found. Indeed, isonicotinamide can cocrystallize with several carboxylic acids, but molecular complexes of temozolomide with these coformers have not been reported in the CSD. This observation, resulting from the high similarity between the two amides, can be used as a suggestion for experimental validation of the proposed hypothesis. Inspection of Tables 2 and S1 leads to many more such suggestions. For example, according to the proposed rule, 4-nitrobenzoic acid should cocrystallize with nicotinamide, 2-ethoxybenzamide and temozolomide. In general, selecting isonicotinamide as the reference molecule and transferring its cocrystallization abilities is practical for those amides for which the slopes of the linear trends observed in Fig. 1 are greater than unity. Based on the proposed hypothesis, one can also find other reference molecules. For example, vanillic acid cocrystallizes with pyrazinamide, but the analogous cocrystal of isonicotinamide has not been reported in the CSD. Also, the existence of homogeneous solids of 3,5-dinitrobenzoic acid with all amides except picolinamide can be inferred from their similarities with respect to benzamide. All corresponding $H^{\text{mix}}$ values are within the acceptable range.
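The gap-filling logic described above can be sketched as a simple set operation in Python. The coformer labels, the sets of known cocrystals and the mixing enthalpies below are placeholders, not data from Table 2 or Table S1; only the slope value of 1.21 and the −1.30 kcal/mol threshold are taken from the text.

```python
THRESHOLD = -1.30  # kcal/mol


def candidates_by_analogy(known_ref, known_target, h_mix_target, slope,
                          threshold=THRESHOLD):
    """Coformers known for the reference but not yet for the target.

    A coformer is proposed only if the target/reference slope exceeds
    unity and the target's own mixing enthalpy passes the threshold.
    """
    if slope <= 1.0:
        return []
    gaps = known_ref - known_target
    return sorted(c for c in gaps
                  if c in h_mix_target and h_mix_target[c] < threshold)


# Placeholder data: coformers A-C are hypothetical
known_reference = {"coformer A", "coformer B", "coformer C"}
known_target = {"coformer B"}
h_mix_target = {"coformer A": -2.1, "coformer B": -2.4, "coformer C": -1.0}

proposed = candidates_by_analogy(known_reference, known_target,
                                 h_mix_target, slope=1.21)
print(proposed)
```

Coformers lacking a computed mixing enthalpy are skipped rather than guessed, so the proposed list contains only pairs that satisfy both conditions.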

Table 2 Shortened version of the list of cocrystals formed by (1) isonicotinamide, (2) nicotinamide, (3) pyrazinamide, (4) temozolomide, (5) benzamide, (6) picolinamide and (8) 2-ethoxybenzamide

However, some caution is warranted based on the collection presented in Table 3. It gathers some exemplary cases for which the value of $H^{\text{mix}}$ used for cocrystal screening can be misleading. For example, cocrystals of temozolomide with some of the amides considered here have been reported, despite the fact that, according to uncritical inference based on excess enthalpy values, such binary systems should rather exhibit immiscibility in the solid state. The occurrence of such false negatives is an inherent problem when $H^{\text{mix}}$ is applied to highly similar components. Indeed, the excess function cannot be used for single-component systems, for which by definition $H^{\text{mix}}$ = 0. Thus, the affinities of two compounds can be so similar that inference based on the excess function must fail. Some false positive examples can probably also be found. Despite these limitations, the proposed approach can be used to augment virtual cocrystal screening by taking advantage of analogy. The generality of this rule can be readily confirmed or falsified by performing a series of experimental screenings, which will be the subject of forthcoming projects.

Table 3 List of some falsely predicted cocrystals based on similarities of $H^{\text{mix}}$ values

Validation of similarities criterion based on extended coformers list

The first set comprised compounds that have already been used for cocrystallization with aromatic or heteroaromatic amides. It is also interesting to check whether a similar correspondence in mixing abilities between the amides and isonicotinamide is observed for other types of coformers. For this purpose, the second set, comprising compounds found on the EAFUS and GRAS lists, was subjected to the same analysis as the first set of coformers. Interestingly, highly linear trends were again observed for the coformers included in the second set. For consistency with the data provided in Fig. 1, the results were condensed into relationships between the computed values of Spearman’s rank and the slopes of the regression lines of the $H^{\text{mix}}$ distributions of a given amide with respect to isonicotinamide. The data presented in Fig. 3 suggest that, for both sets of coformers, high linear correlations can be found between the affinities of isonicotinamide and those of many amides. Although there are some exceptions, the obtained distributions are characterized by σ > 0.9 for the majority of the considered amides. Thus, high values of Spearman’s rank, suggesting high linearity, are found for different sets of coformers and are not just an artifact of selecting a single series of probing molecules.

Fig. 3 Correlation between Spearman’s rank values and the slopes of the linear regression lines of the $H^{\text{mix}}$ distributions of each of the studied amides with respect to isonicotinamide

The abscissa of Fig. 3 characterizes the relative affinities of the amides toward coformers with respect to isonicotinamide. As mentioned above, the proposed rule of cocrystal screening by analogy requires finding a reference molecule for which cocrystals are known. However, such inference has a chance of success only if it is applied to compounds with higher affinities than the reference molecule, that is, with regression line slopes greater than unity. Many such cases can be found for both sets of coformers. It is worth mentioning that no restriction was imposed on the data when computing either the slopes or Spearman’s ranks. Thus, on the one hand, the plots provided in Fig. 3 offer guidance for the rational selection of the most promising candidates for reference molecules. On the other hand, they indicate the limits of applicability of the screening-by-analogy approach. The slopes of three selected amides are indicated in Fig. 3. Thus, selecting benzamide as a reference molecule for cocrystal screening of the other two amides is justified, but predictions in the opposite direction can lead to misclassification.
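One possible way to read Fig. 3 programmatically is sketched below in Python: each amide is described by its Spearman's rank and slope with respect to the reference and is sorted into one of three groups. The cutoff σ > 0.9 is used here as an assumed similarity requirement (the text reports it only as typical of most amides), the amide labels are placeholders, and the σ values are hypothetical.

```python
def classify(sigma, slope, min_sigma=0.9):
    """Suggest how an amide relates to the reference molecule.

    sigma: Spearman's rank of its H_mix distribution vs. the reference
    slope: slope of the regression line vs. the reference
    min_sigma: assumed minimum correlation for screening by analogy
    """
    if sigma <= min_sigma:
        return "not comparable"
    if slope > 1.0:
        return "transfer from reference"
    return "use as reference instead"


# Placeholder amides with hypothetical (sigma, slope) pairs
amides = {
    "amide A": (0.95, 1.21),
    "amide B": (0.93, 0.84),
    "amide C": (0.70, 1.50),
}

groups = {name: classify(sigma, slope) for name, (sigma, slope) in amides.items()}
for name, group in groups.items():
    print(name, "->", group)
```

The third group reflects the reversed use of the analogy discussed above: a compound with a slope below unity is a better source of predictions for the reference than the other way round.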

Conclusions

The idea of a likeness between different compounds, in agreement with chemical intuition, is extended here to cocrystallization propensities. In particular, the similarity of liquid miscibility under supercooled conditions was proposed as a measure of correspondence between the cocrystallization landscapes of different cocrystal formers. The presented data provide a quite comprehensive screening of cocrystals of aromatic and heteroaromatic amides with a variety of coformers. It has been demonstrated that many of the distributions exhibit highly similar patterns over the whole range of excess enthalpy, which is well supported by experimentally observed cocrystals. This suggests that the affinities of one component toward a given set of coformers can inform about the affinities of another chemical species interacting with the same set of probing molecules. Due to analogies in the intermolecular interactions in liquids, such similarities can also be extended to cocrystallization abilities. This is the essence of the hypothesis proposed here, which states that a properly selected reference molecule, for which cocrystals have been experimentally documented, can provide practical information about the cocrystallization propensities of another compound, provided that two criteria are met. The first requires high similarity, expressed in terms of correlations between excess enthalpy distributions obtained for a common set of coformers. The second requires that the estimated $H^{\text{mix}}$ value is within the accepted range of high cocrystallization probability. Although some systems were misclassified as false negatives, many positive examples were enumerated. The main conclusion drawn from the presented analysis is that it is not necessary to perform experimental cocrystallization of every pair of coformers, since miscibility in the solid state of one compound can be transferred to another one, at least in the case of aromatic and heteroaromatic amides. However, the generality of this conclusion is worth further exploration, both theoretical and experimental.