Introduction

A graph-set approach for describing hydrogen bonding patterns was developed many years ago [1, 2]. The graph-set descriptor, G d a(n), designates a pattern of hydrogen bonds, where G = {D, S, C, R} is the pattern type (discrete, intramolecular, chain, or ring), a is the number of acceptors, d is the number of donors, and n is the number of atoms in the pattern, also known as the degree of the pattern. When the pattern contains only one type of hydrogen bond, it is called a motif and is designated by a unitary graph-set. If several types of hydrogen bonds are involved in the pattern, then a complex (binary, ternary, etc.) graph-set describes the pattern, e.g., C2 2(6) and C2 2(10) (Fig. 1). This graph-set method has been implemented in the Cambridge Structural Database programs for visualization and characterization of non-covalent networks in molecular crystals [3, 4].

Fig. 1 The selenate anion forms (a) a chain pattern along the b axis and (b) an undulating ribbon-like pattern along the [\( \bar{1}01 \)] direction; (c) a cross-section of the patterns viewed along the a axis

The direction of a hydrogen bond can be designated by an arrow. This concept was introduced by Grell et al. [5, 6]. Two symbols, > and <, have subsequently been implemented in the MERCURY program [4]. The symbol > means that the hydrogen bond runs from donor to acceptor, and the symbol < has the opposite meaning. For example, if the chain pattern C2 2(6)>a<b is formed by two hydrogen bonds a and b, then hydrogen bond a runs from donor to acceptor and the second hydrogen bond b has the same acceptor as a.

However, when a crystal grows, molecules are arranged into consecutive layers, and the hydrogen bonds are likewise created one after another, much as a wall grows during bricklaying. Therefore, a pattern of hydrogen bonds can be considered as a set of individual hydrogen bonds in which the molecular units are involved. Since each pattern is described by the graph-set descriptor, G d a(n), the question arises whether the individual descriptors can be added to each other to give the descriptor of a complex graph-set. A real example of an existing structure was chosen to answer this question.

In the crystal structure of bis(2-aminopyrimidinium) selenate monohydrate, a complex hydrogen bonding network occurs. On the basis of this particular structure, it is shown that the molecular graph-set descriptors can be added to each other resulting in complex graph-set descriptors for the hydrogen bonding network. Additionally, a mathematical formula is presented.

Experimental

Synthesis

The starting compounds, 2-aminopyrimidine [Aldrich, purum, ≥98% (NT)] and selenic acid (Aldrich, 40 wt% in H2O, 99.95%), were used as supplied. The dilute acid solution was added to a hot aqueous solution of 2-aminopyrimidine in a 1:1 molar ratio. After cooling to room temperature, the solution remained clear, without any precipitate. It was then slowly evaporated at room temperature for several days until good-quality single crystals formed.

The following tentative assignments can be made for the IR and Raman spectra. IR (Nujol mull, cm−1): 3272 s, 3066 s (ν N–H···N), 3124 s, 3110 s, 3035 s (ν C–H), 1683 s sh, 1652 vs (δ N–H), 1626 vs (δ H2O), 1577 m sh, 1561 m, 1538 m, 1507 w (ν C–N and ν C–C), 1293 m (δ N–H···N), 1193 w, 1182 w, 1141 w, 1121 vw (ν C–N and ν C–C), 1039 w/1028 w (γ N–H···N), 1008 vw, 1003 vw (ν C–C), 881 vs sh, 865 vs (ν3 SeO4 2−), 876 vs (ν3 SeO4 2− and ring symmetric breathing), 835 vs (ν1 SeO4 2−), 698 m, 684 m, 603 w (lib H2O), 426 vs sh, 420 vs, 407 s, 398 m, 389 m (ν4 SeO4 2−), 356 w/345 w (ν2 SeO4 2−), 213 w, 188 m, 174 m, 161 m, 154 m, 135 m, 115 w, 95 m, 83 w, 72 w, 68 w, 57 w (lattice vibrations). Raman (cm−1): 3124 vw, 3101 vw, 3093 vw, 3032 vw (ν C–H), 1626 vw (δ H2O), 1536 w (ν C–N and ν C–C), 1290 vw (δ N–H···N), 1191 vw, 1141 vw, 1117 vw (ν C–N and ν C–C), 1009 vw (ν C–C), 876 vs (ν3 SeO4 2− and ring symmetric breathing), 864 w sh (ν3 SeO4 2−), 834 w (ν1 SeO4 2−), 422 w, 414 w, 401 vw, 390 w (ν4 SeO4 2−), 354 w, 337 vw (ν2 SeO4 2−), 198 vw, 156 w, 105 s (lattice vibrations).

Single crystal X-ray diffraction studies

X-ray diffraction data were collected on a KUMA Diffraction KM-4 four-circle single-crystal diffractometer equipped with a CCD detector using graphite-monochromatized Mo Kα radiation (λ = 0.71073 Å). The raw data were treated with the CrysAlis Data Reduction Program (version 1.172.30.3), which included an absorption correction. The reflection intensities were corrected for Lorentz and polarization effects. The crystal structure was solved by direct methods [7] and refined by the full-matrix least-squares method using the SHELXL-97 program [7]. Non-hydrogen atoms were refined with anisotropic displacement parameters. H atoms were placed in calculated positions and allowed to ride on the parent atom, with Uiso(H) set to 1.2Ueq(C, N) or 1.5Ueq(O).

Results and discussion

Crystal structure packing of bis(2-aminopyrimidinium) selenate monohydrate

The studied compound crystallizes in the monoclinic P21/n space group (Table 1). The asymmetric unit comprises a water molecule, a selenate anion, and two 2-aminopyrimidinium cations, Hampy+, which balance the negative charge of the selenate anion (Scheme 1). All these species are involved in the hydrogen bonding network (Table 2). The most obvious chain pattern of hydrogen bonds is formed by the water molecules and the selenate anions (Fig. 1a). This chain extends along the b axis and is designated by the binary graph-set C2 2(6). In addition, the SeO4 2− anion forms two ring patterns with the two symmetry-independent organic cations. Each ring is generated by the hydrogen bonds formed between the aromatic N–H group and the amino group of a cation and the selenate anion (Fig. 1b). Both rings are described by the binary graph-set R2 2(8), and they share the Se and O(1) atoms of the selenate anion. The two 2-aminopyrimidinium cations and one selenate anion form the structural unit (Hampy)2SeO4. The binary graph-set C2 2(6) (Fig. 1a) and both binary graph sets R2 2(8) (Fig. 1b) intersect at the selenate anion. Additionally, the water molecule, which is part of the C2 2(6) chain pattern, also connects the (Hampy)2SeO4 units together, resulting in an undulating ribbon-like pattern of hydrogen bonds along the [\( \bar{1}01 \)] direction (Fig. 1b). Therefore, all the species of bis(2-aminopyrimidinium) selenate monohydrate are involved in the ribbon (Fig. 1b). In this complex molecular arrangement, four fourth-level chains are found: C4 2(10), C4 3(12), C4 3(10), and C4 3(14) (Fig. 1b). The last two chains arise from the fact that the Se–O(1) bond belongs to two R2 2(8) rings simultaneously.

Table 1 Crystal data and structure refinement for bis(2-aminopyrimidinium) selenate monohydrate

Scheme 1 ORTEP drawing of bis(2-aminopyrimidinium) selenate monohydrate, with the atom numbering scheme. Displacement ellipsoids for non-H atoms are drawn at the 50% probability level [8]

Table 2 Hydrogen bonds (Å, °) for studied compound

In summary, four chain patterns are incorporated into the ribbon. However, the ribbon results from the intersection of the binary C2 2(6) and R2 2(8) patterns rather than from four tangled chain patterns. The ribbon propagates along the [\( \bar{1}01 \)] direction and is connected to the adjacent ribbon in the direction in which the C2 2(6) pattern runs. Thus, a layered hydrogen bonding network parallel to the (101) plane is created (Fig. 1c); the layers themselves, however, are not interconnected by hydrogen bonds. As a consequence of this layered structure, crystals of (Hampy)2SeO4·H2O can be easily cleaved along the (101) plane. It appears that the layered structure results from the tetrahedral geometry of the SeO4 2− anion along with its ability to form hydrogen bonds in different directions.

Construction of complex hydrogen bonds

When a crystal grows, molecules are built into successive layers, and the hydrogen bonds are created successively as well. According to Etter's notation [1], hydrogen bonding patterns can be described by G = {D, S, C, R} graph sets. However, if each hydrogen bond is considered independently, i.e., as a building block of a pattern, then each hydrogen bond can be described by a finite graph-set, F. Thus, the F graph-set is a subgraph-set of the graph-set G, and one can say that every chain or ring pattern is constructed from a set of F graph sets, (F1, …, F i ) ⊂ G. For example, if G is represented by the discrete graph-set D, then i = 1 and F = D, by the definition of D [2, 5].

A unitary finite graph-set F1 1(2) = F is always found, because every pattern of hydrogen bonds contains a hydrogen atom and its acceptor. One might therefore expect that the symbol of the complex graph-set G can be obtained by the exact summation of the unitary graph sets F. However, this summation fails, because some atoms lie between adjacent finite patterns and do not belong to any of them. They are therefore not counted in the complex G d a(n) graph-set during summation.

Let us consider a binary F d a(n) graph-set. Its most important feature is that it contains two hydrogen bonds, one at each end. For instance, the F2 2(8) and F2 1(3) graph sets are parts of a chain (Fig. 2a) found in the structure of ethylene–1,2-diammonium bis(hydrogen succinate) [9]. If the donors, acceptors, and numbers of atoms in the symbols of the two graph sets are simply summed, \( F_{2+2}^{2+1}(8 + 3) = C_{4}^{3}(11) \), the result is an incorrect descriptor, because the binary graph sets F2 2(8) and F2 1(3) have a common part, i.e., one hydrogen atom and one acceptor (see Fig. 2a). This common part itself forms a graph-set, F1 1(2), and it is included in the chain pattern twice: on the one hand, F2 2(8) shares a common part with F2 1(3); on the other, F2 1(3) intersects with the next F2 2(8). Therefore, the F1 1(2) graph-set ought to be subtracted twice. As a result, the simple mathematical operation F2 2(8) + F2 1(3) − 2F1 1(2) gives the proper graph-set: \( F_{2+2-2\times 1}^{2+1-2\times 1}(8 + 3 - 2 \times 2) = C_{2}^{1}(7) \).

Fig. 2 The chain pattern of (a) ethylene–1,2-diammonium bis(hydrogen succinate), C2 1(7), and (b) bis(2-aminopyrimidinium) selenate monohydrate, C2 2(6)

These facts can be summarized as follows:

  1. When two binary F d a(n) graph sets are added to each other, then the F1 1(2) graph-set should be subtracted.

  2. The beginning and the end clearly define a finite pattern, which is described by an open, i.e., finite, graph-set. The F1 1(2) graph-set should then be subtracted m − 1 times from m added binary F d a(n) graph sets according to the following general equation:

$$ G_{d}^{a}(n) = \sum\limits_{i = 1}^{m} F_{d_{i}}^{a_{i}}(n_{i}) - (m - 1)\,F_{1}^{1}(2),\;{\text{where G}} = \left\{ {\text{F}} \right\} \qquad (1) $$

  3. Ring and chain patterns always create a closed graph-set from the topological point of view: the end is connected with the beginning. Therefore, the F1 1(2) graph-set should in this instance be subtracted m times according to the following general equation:

$$ G_{d}^{a}(n) = \sum\limits_{i = 1}^{m} F_{d_{i}}^{a_{i}}(n_{i}) - m\,F_{1}^{1}(2),\;{\text{where G}} = \left\{ {{\text{C}},{\text{ R}}} \right\} \qquad (2) $$

For the bis(2-aminopyrimidinium) selenate monohydrate, several finite F graph sets construct chains and rings (2apy = 2-aminopyrimidinium ion):

  1. F2 2(5)water + F2 2(5)selenate − 2F1 1(2) = C2 2(6) chain (Figs. 1a, 2b),

  2. F2 1(3)selenate + F2 2(5)2apy + F2 1(3)water + F2 2(7)2apy − 4F1 1(2) = C4 2(10) chain (Fig. 1b),

  3. F2 2(5)2apy + F2 2(5)selenate + F2 2(7)2apy + F2 1(3)water − 4F1 1(2) = C4 3(12) chain (Fig. 1b),

  4. F2 2(7)2apy + F2 2(5)selenate − 2F1 1(2) = R2 2(8) both rings (Fig. 1b), etc.
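The bookkeeping of Eqs. 1 and 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the names `GraphSet`, `UNITARY`, and `combine` are hypothetical and do not come from the original work or from any crystallographic software. Each descriptor is stored as a (d, a, n) triple, and the shared F1 1(2) pattern is subtracted m − 1 times for a finite pattern or m times for a chain or ring.

```python
from typing import NamedTuple, Sequence


class GraphSet(NamedTuple):
    """Hypothetical container for a descriptor F d a(n)."""
    d: int  # number of donors
    a: int  # number of acceptors
    n: int  # number of atoms (degree of the pattern)


# The shared part between adjacent binary finite patterns, F1 1(2).
UNITARY = GraphSet(d=1, a=1, n=2)


def combine(parts: Sequence[GraphSet], closed: bool) -> GraphSet:
    """Add m finite graph sets and subtract the shared F1 1(2).

    closed=False follows Eq. 1 (subtract m - 1 times, finite pattern);
    closed=True follows Eq. 2 (subtract m times, chain or ring).
    """
    if not parts:
        raise ValueError("at least one finite graph-set is required")
    m = len(parts)
    shared = m if closed else m - 1
    return GraphSet(
        d=sum(p.d for p in parts) - shared * UNITARY.d,
        a=sum(p.a for p in parts) - shared * UNITARY.a,
        n=sum(p.n for p in parts) - shared * UNITARY.n,
    )


# Examples taken from the text (all are closed patterns, Eq. 2).
succinate_chain = combine([GraphSet(2, 2, 8), GraphSet(2, 1, 3)], closed=True)
water_selenate_chain = combine([GraphSet(2, 2, 5), GraphSet(2, 2, 5)], closed=True)
ribbon_chain = combine(
    [GraphSet(2, 1, 3), GraphSet(2, 2, 5), GraphSet(2, 1, 3), GraphSet(2, 2, 7)],
    closed=True,
)
ring = combine([GraphSet(2, 2, 7), GraphSet(2, 2, 5)], closed=True)
```

With these inputs the sketch gives (d, a, n) = (2, 1, 7), (2, 2, 6), (4, 2, 10), and (2, 2, 8), i.e., the C2 1(7), C2 2(6), C4 2(10), and R2 2(8) descriptors quoted in the text. The function returns only the numerical part of the descriptor; whether the closed pattern is a chain or a ring has to be decided from the structure.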

Direction of hydrogen bonds in graph-set descriptors

If the direction of the hydrogen bonds is relevant for one's purposes, the above algorithm for seeking complex patterns can still be used. However, the multiplier m preceding the F1 1(2) descriptor in Eqs. 1 and 2 should then not be used, because the individual finite patterns F1 1(2) have different directionality. For instance, in bis(2-aminopyrimidinium) selenate monohydrate the chain with graph-set C4 2(10) is built from the finite patterns F2 1(3)>a<f, F2 2(5)<f>e, F2 1(3)>e<c, and F2 2(7)<c>a, and the unitary patterns F1 1(2)<f, F1 1(2)>e, F1 1(2)<c, and F1 1(2)>a (Scheme 2a). All these F d a(n) graph sets should be written in the order of the atomic path, and the directionality descriptors should also be subtracted in that order, because this is the only way to obtain the correct chain descriptor. The algorithm thus gives the chain descriptor C4 2(10)>a<f>e<c.

Scheme 2 Complex chain C4 2(10) results from (a) the summation of the binary finite graph sets F d a(n) based on Eq. 2 (direction of hydrogen bonds is included) and (b) the summation of the elementary graph sets E d a(n)
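The directional bookkeeping for the C4 2(10) chain can be illustrated with a short Python sketch. It assumes, purely for illustration, that each binary finite pattern is represented by its two directional tokens written in the order of the atomic path, and that the token shared with the next pattern is the one removed; the function name `directional_suffix` is hypothetical.

```python
from typing import List, Sequence, Tuple


def directional_suffix(patterns: Sequence[Tuple[str, ...]], closed: bool = True) -> str:
    """Build the directional part of a descriptor from finite patterns.

    Each pattern is a tuple of tokens such as (">a", "<f"), listed in the
    order of the atomic path. The token that a pattern shares with the next
    one (its unitary F1 1(2) part) is dropped once, in path order.
    """
    if not patterns:
        raise ValueError("at least one finite pattern is required")
    m = len(patterns)
    kept: List[str] = []
    for k, tokens in enumerate(patterns):
        has_next = closed or k < m - 1
        if has_next:
            following = patterns[(k + 1) % m]
            if tokens[-1] != following[0]:
                raise ValueError("adjacent patterns do not share a hydrogen bond")
            kept.extend(tokens[:-1])  # drop the shared unitary token
        else:
            kept.extend(tokens)       # open end: nothing to subtract
    return "".join(kept)


# Finite patterns of the C4 2(10) chain as listed in the text.
chain = [(">a", "<f"), ("<f", ">e"), (">e", "<c"), ("<c", ">a")]
suffix = directional_suffix(chain)  # ">a<f>e<c"
```

Appending this suffix to the numerical descriptor reproduces C4 2(10)>a<f>e<c. If the patterns are not supplied in path order, adjacent tokens no longer match and the sketch raises an error, which mirrors the requirement that the order of the atomic path be preserved.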

Predicting a graph-set descriptor

A complex graph-set of hydrogen bonding patterns can be constructed from individual molecular fragments by taking into account the presumable acceptor and donor sites. Let us now consider an isolated molecule in which a simple atomic path runs through the molecule from one acceptor/donor atom to another. For instance, one three-atom path H–O–H and two two-atom paths H–O are found in a water molecule. All such atomic paths can be described by elementary graph sets. Let these be E d a(n), to distinguish this molecular graph-set from the hydrogen bonding graph-set F d a(n), which describes an interaction. In the E d a(n) symbol, a and d are the numbers of acceptors and donors located only at the two ends of the atomic path. They therefore do not give the total number of acceptors and donors in the whole path, and thus a, d ∈ {0, 1, 2}. As a consequence of this definition, E d a(n) is a two-node graph-set and the following graph sets can be found in each molecule: {E0 1(1), E1 0(1), E0 2(n), E2 0(n), E1 1(n)}. For instance, the three-atom path H–O–H and the two two-atom paths H–O are described by the elementary graph sets E2 0(3) and two E1 1(2), respectively. Additionally, a one-atom path can also be described, because a single acceptor/donor atom can itself participate in a bifurcated hydrogen bonding interaction. Thus, the one-atom elementary graph sets are E0 1(1) and E1 0(1). In the case of the SeO4 2− anion, the four oxygen atoms are treated as presumable acceptor sites in the hydrogen bonding network. So, four one-atom paths and six three-atom paths are found in the selenate anion, described by the graph sets E0 1(1) and E0 2(3), respectively.
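The counting of elementary graph sets described above can be reproduced with a small Python sketch. It is an illustration under stated assumptions, not part of the original method: the molecule is given as a hypothetical bond list, the donor sites are taken to be the H atoms and the acceptor sites the O atoms, and the helper names `path_length` and `elementary_sets` are invented for this example.

```python
from collections import Counter, deque
from itertools import combinations
from typing import Dict, List, Tuple


def path_length(bonds: Dict[str, List[str]], start: str, end: str) -> int:
    """Number of atoms on the shortest bonded path from start to end."""
    seen = {start: 1}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if atom == end:
            return seen[atom]
        for neighbour in bonds[atom]:
            if neighbour not in seen:
                seen[neighbour] = seen[atom] + 1
                queue.append(neighbour)
    raise ValueError("atoms are not connected")


def elementary_sets(bonds: Dict[str, List[str]], sites: Dict[str, str]) -> Counter:
    """Count elementary graph sets as (d, a, n) triples.

    sites maps each presumable hydrogen bonding site to "D" (donor)
    or "A" (acceptor); only the two ends of a path are counted.
    """
    found: Counter = Counter()
    for kind in sites.values():               # one-atom paths
        found[(int(kind == "D"), int(kind == "A"), 1)] += 1
    for (x, kx), (y, ky) in combinations(sites.items(), 2):
        ends = kx + ky                        # e.g. "DA" or "AA"
        key: Tuple[int, int, int] = (ends.count("D"), ends.count("A"),
                                     path_length(bonds, x, y))
        found[key] += 1
    return found


water = elementary_sets(
    bonds={"O": ["H1", "H2"], "H1": ["O"], "H2": ["O"]},
    sites={"H1": "D", "H2": "D", "O": "A"},
)
selenate = elementary_sets(
    bonds={"Se": ["O1", "O2", "O3", "O4"],
           "O1": ["Se"], "O2": ["Se"], "O3": ["Se"], "O4": ["Se"]},
    sites={"O1": "A", "O2": "A", "O3": "A", "O4": "A"},
)
```

For water the sketch finds two E1 0(1), one E0 1(1), two E1 1(2), and one E2 0(3); for the selenate anion it finds four E0 1(1) and six E0 2(3), in agreement with the path counts given in the text.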

Let us now consider the water molecule and the selenate anion together to find out what kinds of hydrogen bonding patterns can be constructed by those two species. The sets of elementary graph sets for H2O and SeO4 2− are G1 = {2E1 0(1), E0 1(1), 2E1 1(2), E2 0(3)} and G2 = {4E0 1(1), 6E0 2(3)}, respectively. A graph-set descriptor of a particular hydrogen bonding pattern can now be constructed by summing elementary molecular graph sets from G1 and G2. It is worth noting that the addition of the molecular graph sets is a mathematical expression that corresponds to the creation of the hydrogen bonding interactions. For H2O and SeO4 2−, the following hydrogen bonding patterns are obtained:

  1. E1 0(1) + E0 1(1) = {C1 1(2), D1 1(2)} (Fig. 3a, b),

  2. E1 0(1) + E0 2(3) = {C1 2(4), R1 2(4)} (Fig. 3c, d),

  3. E2 0(3) + E0 1(1) = {C2 1(4), R2 1(4)} (Fig. 3e, f),

  4. E2 0(3) + E0 2(3) = {C2 2(6), R2 2(6)} (Fig. 3g, h).

Fig. 3 Chain and ring patterns created by the water molecule and the selenate anion: (a) C1 1(2), (b) D1 1(2), (c) C1 2(4), (d) R1 2(4), (e) C2 1(4), (f) R2 1(4), (g) C2 2(6), (h) R2 2(6)

Overall, a search for possible graph-set descriptors of hydrogen bonding patterns consists of adding each element of the G1 set to each element of the G2 set. However, this procedure sometimes gives an unrealistic result. For instance, the summation E0 1(1)water + E0 2(3)selenate = {C0 3(4), R0 3(4)} is unrealistic, because the resulting patterns do not involve donors. Thus, the question arises: when does the summation give a real result? Since the molecules organize themselves in the order donor atom → acceptor atom, the molecular graph sets E d a(n) should also follow the same order. Therefore, two molecules should create chain or ring patterns as follows:

  1. E1 0(1) + E0 1(1) = C1 1(2); in this case only the chain pattern needs careful consideration, because a "ring" is de facto a discrete pattern, D1 1(2),

  2. E1 0(1) + E0 2(n) = {C1 2(n + 1), R1 2(n + 1)},

  3. E2 0(n) + E0 1(1) = {C2 1(n + 1), R2 1(n + 1)},

  4. E2 0(n1) + E0 2(n2) = {C2 2(n1 + n2), R2 2(n1 + n2)},

  5. E1 1(n1) + E1 1(n2) = {C2 2(n1 + n2), R2 2(n1 + n2)}.
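The donor → acceptor ordering rule behind the list above can be sketched in Python. The representation is an assumption made for illustration only: each elementary graph-set is written as a string of its end types ("D", "A", "DD", "AA", "DA", or "AD") together with its length n, and a closed pattern is accepted only when every junction between consecutive paths pairs a donor end with an acceptor end. The names `Elementary` and `closed_pattern` are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Elementary(NamedTuple):
    """Hypothetical elementary graph-set E d a(n), stored by its end types."""
    ends: str  # "D" or "A" for a one-atom path; "DD", "AA", "DA", "AD" otherwise
    n: int     # number of atoms in the path


def closed_pattern(paths: Sequence[Elementary]) -> Optional[Tuple[int, int, int]]:
    """Return (d, a, n) of the chain/ring, or None if the order is not allowed.

    The paths are joined cyclically; each junction must connect a donor end
    to an acceptor end, otherwise no hydrogen bond can form there.
    """
    m = len(paths)
    if m < 2:
        return None
    for k, path in enumerate(paths):
        following = paths[(k + 1) % m]
        if path.ends[-1] == following.ends[0]:  # D-D or A-A junction
            return None
    d = sum(p.ends.count("D") for p in paths)
    a = sum(p.ends.count("A") for p in paths)
    n = sum(p.n for p in paths)
    return (d, a, n)


# Two-molecule cases from the list above.
case_2 = closed_pattern([Elementary("D", 1), Elementary("AA", 3)])     # (1, 2, 4)
case_4 = closed_pattern([Elementary("DD", 3), Elementary("AA", 3)])    # (2, 2, 6)
unrealistic = closed_pattern([Elementary("A", 1), Elementary("AA", 3)])  # None

# Four-molecule chain: water, 2apy, selenate, 2apy, in path order.
ordered = closed_pattern([Elementary("A", 1), Elementary("DD", 5),
                          Elementary("A", 1), Elementary("DD", 3)])    # (4, 2, 10)
permuted = closed_pattern([Elementary("A", 1), Elementary("A", 1),
                           Elementary("DD", 5), Elementary("DD", 3)])  # None
```

The sketch only tests the donor/acceptor order and adds up d, a, and n; it does not decide between a chain and a ring and it knows nothing about molecular geometry, so an accepted sum is a candidate descriptor rather than a guaranteed pattern.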

In the light of the above considerations, the order of the molecular graph sets seems to be trivial. However, if it is not preserved, then the chain or ring pattern of hydrogen bonds is not created. For instance, the E0 1(1)water + E2 0(5)2apy + E0 1(1)selenate + E2 0(3)2apy graph sets result in the chain C4 2(10) depicted in Fig. 1b and Scheme 2b. If they are permuted to E0 1(1)water + E0 1(1)selenate + E2 0(5)2apy + E2 0(3)2apy, they do not create a chain graph-set, because the water molecule and the selenate anion do not form a hydrogen bond here. The same logic applies to the two 2-aminopyrimidinium cations.

In summary, a set of molecular graph sets can be found in each molecule, Gi = {E0 1(1), E1 0(1), E0 2(n), E2 0(n), E1 1(n)}, where i is the number of interacting molecules. If many molecules interact with each other to give a chain or ring pattern, then the symbols of the molecular graph sets can be added to each other by moving alternately along the arrows from the D(onor) to the A(cceptor) area presented in Scheme 3. To obtain a descriptor of a chain or a ring pattern, the last added elementary graph-set must be connected with the first one. Otherwise, a finite graph-set is constructed. As a consequence of the movement among the elementary graph sets shown in Scheme 3, the resultant descriptor of a chain or a ring pattern {C d a(n), R d a(n)} always possesses the following feature:

$$ \left| a - d \right| \le \begin{cases} \dfrac{i}{2}, & {\text{for even }} i \\ \dfrac{i - 1}{2}, & {\text{for odd }} i \end{cases} $$

Scheme 3 Allowed summations of elementary graph sets
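The bound on |a − d| is easy to check numerically. The two cases of the inequality coincide with integer (floor) division of i by 2, which the following minimal Python sketch uses; the function name `satisfies_bound` is hypothetical, and the check is only a necessary condition taken from the inequality above, not a test that a pattern actually exists.

```python
def satisfies_bound(d: int, a: int, i: int) -> bool:
    """Check |a - d| <= i/2 (even i) or (i - 1)/2 (odd i), i.e. floor(i / 2)."""
    if i < 1:
        raise ValueError("the number of interacting molecules must be positive")
    return abs(a - d) <= i // 2


# Descriptors quoted in the text, with i taken as the number of summed molecules.
ok_chain_4 = satisfies_bound(d=4, a=2, i=4)   # C4 2(10): |2 - 4| = 2 <= 2
ok_chain_2 = satisfies_bound(d=2, a=2, i=2)   # C2 2(6):  |2 - 2| = 0 <= 1
rejected = satisfies_bound(d=0, a=3, i=2)     # C0 3(4):  |3 - 0| = 3 >  1
```

The donor-free sum C0 3(4), described in the text as unrealistic, is the only one of the three that violates the bound.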

Conclusions

A proper graph-set descriptor of hydrogen bonding patterns is usually obtained by simple counting of atoms. However, the pattern is a set of individual hydrogen bonds arranged in accordance with the symmetry elements. These separate hydrogen bonds are described by finite graph sets, F d a(n), and they can be added to each other resulting in a graph-set C d a(n) or R d a(n) associated with complex chain or ring patterns, respectively. This mathematical concept also takes into account an excessive graph-set F1 1(2) being an intersection of graph sets F d a(n), which is subtracted as shown in Eqs. 1 and 2.

In this article, Etter's notation is used for the description of individual molecules by elementary graph sets, E d a(n). It was shown that each molecule can be described by a set of five elementary graph sets, Gi = {E0 1(1), E1 0(1), E0 2(n), E2 0(n), E1 1(n)}. The creation of hydrogen bonds is then expressed by the summation of elementary graph sets, resulting in a graph-set descriptor for a chain or a ring hydrogen bonding pattern. A scheme of allowed summations of graph sets has been given.

However, since this mathematical concept does not differentiate atoms from molecules, some complex patterns cannot be formed in a real crystal because of steric hindrance. For instance, it is impossible to form an R2 2(11) ring pattern with two amino groups of a 1,4-phenylenediammonium cation and two oxygen atoms of a trifluoroacetate anion, despite the fact that the summation of the elementary graph sets yields E2 0(8) + E0 2(3) = R2 2(11). A chain graph-set C2 2(11), however, can be found in the crystal structure of 1,4-phenylenediammonium trifluoroacetate [10].

Supplementary information

CCDC-707763 contains the supplementary crystallographic data for this paper. These data can be obtained free of charge via www.ccdc.cam.ac.uk/conts/retrieving.html (or from the Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, UK; fax: +44 1223 336033; or deposit@ccdc.cam.ac.uk).