1 Introduction

Over the decades, hydrogen-bonded complexes have been the subject of detailed theoretical and experimental investigations because of the role they play in chemical reactions and biological systems. Highly accurate and thus reliable theoretical values of interaction energies and interaction-induced properties can be obtained with the coupled cluster singles and doubles method including connected triples corrections (CCSD(T)) [1], combined with sufficiently large and diffuse sets of basis functions. However, the CCSD(T) method scales with the number of basis functions N as \(N^7\), and its computing cost therefore strongly limits the size of systems that can be investigated. The high computational cost of the CCSD(T) method remains an issue despite the fast development of computing resources and algorithms. The problem can be tackled either through a reduction in the value of N, i.e., using smaller basis sets, or through a decrease in the power of N, i.e., using a lower-cost method. Both solutions are commonly used in theoretical investigations. The main drawback of smaller basis sets or lower-quality methods is that they may not be flexible enough to describe subtle intermolecular interactions. To address the basis set size problem, efficient reduced-size basis sets designed for calculations of a particular type of property have been developed. Such property-oriented basis sets, although not able to describe all the properties of a system as accurately as large all-purpose basis sets, compete with them in the evaluation of the class of properties for which they are designed.

The idea of basis sets tailored for calculations of particular properties dates back to 1937 and to the London-type orbitals (LAOs) [2]. Another example of property-oriented basis sets are the so-called polarized basis sets (Pol sets), developed for calculations of electric properties and originating from a simple physical model of a harmonic oscillator embedded in an external electric field [3]. Polarized sets have a long tradition and span from the pioneering Pol basis sets [4–6], through the compact ZPol sets [7–9], to the recently reported LPol basis sets [10]. The ZPol basis sets are recommended for moderately accurate calculations of linear electric properties in large molecular systems [7–9]. The LPol sets have been shown to compete with the much larger Dunning sets in the evaluation of electric properties and of specific optical rotation in organic molecules [10–12].

Interaction energies and induced properties in large hydrogen-bonded complexes are often evaluated using second-order Møller–Plesset perturbation theory (MP2) [13], which scales as \(N^5\). Additionally, the basis set size is kept as small as the target accuracy allows. However, it should be stressed that neglecting higher-order electron correlation contributions, especially in the evaluation of non-linear induced electric properties, may cause a significant deterioration in the results. This cannot be ignored when a highly accurate estimation of the effects is required. It is thus of great importance to estimate the limits of applicability of MP2 or other lower-order approximations in the evaluation of induced properties in hydrogen-bonded systems.
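As a rough illustration of the formal scaling quoted above, the following sketch compares how the cost of a seventh-power and a fifth-power method grows with the basis set size. It assumes pure power-law behavior and ignores prefactors, integral screening and hardware effects; the function name `relative_cost` and the values of N are introduced here only for illustration.

```python
def relative_cost(n_new: float, n_old: float, power: int) -> float:
    """Formal cost ratio of a method scaling as N**power when the number
    of basis functions changes from n_old to n_new (prefactors ignored)."""
    return (n_new / n_old) ** power


# Doubling the basis set size (illustrative choice of N, not from the paper)
ccsdt_growth = relative_cost(200, 100, 7)  # CCSD(T), formal N^7 scaling
mp2_growth = relative_cost(200, 100, 5)    # MP2, formal N^5 scaling
print(ccsdt_growth, mp2_growth)
```

Under these assumptions, doubling N raises the formal cost of CCSD(T) by a factor of 128 and that of MP2 by a factor of 32, which is why both the basis set size and the choice of method are targets for cost reduction.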

An additional problem in the evaluation of interaction energies and properties is the basis set superposition error (BSSE), present in all calculations carried out within the supermolecular approach. The correction for it is mandatory in the majority of the cases where high accuracy is required. This means a considerable increase in computing demands that gets larger with the number of monomers present in the molecular complex. The minimization of this error is also a challenge in the development of new basis sets.

In the present study, we employ the LPol-n (n = ds, fs, dl, fl) sets and four levels of theory, namely HF SCF, MP2, CCSD and CCSD(T), for the calculation of the non-zero components of the induced electric properties (up to the induced electric dipole second hyperpolarizability) in the linear (HCN)m (m = 2–4) complexes. Additionally, we evaluate the interaction energies. The linear (HCN)m clusters have been the subject of numerous theoretical and experimental investigations. Here, we give a brief summary of those most relevant to our work. The theoretical studies used mainly the HF SCF and the MP2 approximations [14, 15]. King et al. [14] evaluated the intermolecular energies of the linear (HCN)m (m = 2–7) clusters. They used the RHF and the MP2 methods with a 6-31+G* basis set. The dipole moments of the complexes were also obtained. The results were only partially corrected for the BSSE; the authors claimed that the counterpoise corrections are physically realistic at the SCF level, but unreliable at correlated levels.

In 1998, a potential function for the (HCN)2 cluster was developed using the IMPT methodology [16]. The properties of the energy minima were reported for the (HCN)2 and the (HCN)3 clusters, together with HF and MP2 values for the dimer. The 6-311G** and 5S4P2D/3S2P basis sets were used. The authors then used the potential to study larger complexes.

The dimer interaction energy was recently evaluated by Li et al. [17] at the MP2/aug-cc-pVTZ level. The reported value was 4.696 kcal/mol. All the results were counterpoise corrected.

Rivelino et al. [18] used a hierarchy of methods (HF, MP2, MP3, MP4, CCSD and CCSD(T)) and basis sets up to aug-cc-pVTZ to evaluate the (HCN)2 interaction energy. For the (HCN)3 and the (HCN)4 complexes, they employed a smaller basis set, i.e., the 6-311++G(d,p), but the (HCN)2 results show that the differences between the aug-cc-pVTZ results and those obtained with the smaller basis set are significant (around 0.5 kcal/mol). Cooperativity values are also reported. With the MP2 method and the smaller basis set, they evaluated the cooperativity effects on the dipole moments for the (HCN)m (m = 2–7) complexes.

Several density functional studies are also available (see for example Ref. [19] and references cited therein). In Ref. [19], the (HCN)m complexes with m up to 10 were studied, and interaction dipole moments and polarizabilities were evaluated with a 6-311G++3d3p basis set and the BPW91 functional, but the results are far from accurate. No attention has been paid to other electric properties.

The most relevant experimental papers for our purposes are the following. Bhattacharya and Gordy determined the HCN dipole moment in the ground vibrational state as 2.985 ± 0.005 D [20]. The vibrational ground-state rotational spectroscopic constants and the structure of the HCN dimer were obtained by Buxton et al. [21], who reported a non-linear vibrationally averaged structure and a well depth for the potential of 4.4 kcal/mol. The \((\hbox{HC}^{15}\hbox{N})_2\) dimer electric dipole moment in the vibrational ground state was measured by Campbell and Kukolich as 6.552(35) D [22], which resulted in an interaction dipole moment of 0.77 D. The structure of the linear \((\hbox{HC}^{15}\hbox{N})_3\) trimer was determined from the microwave spectra of 22 isotopic species [23]. The dipole moment was found to be 10.6(1) D and the interaction dipole moment 1.8 D.

By using the LPol-n (n = ds, fs, dl, fl) bases in the study of the (HCN)m (m = 2–4) complexes, we aim to check the performance of the newly developed basis sets in the evaluation of interaction energies and induced electric properties in hydrogen-bonded complexes. To this end, we estimate both the counterpoise correction and the error introduced into the calculations by the MP2 approximation.

The present paper is organized as follows. Some relevant definitions and the computational details are given in Sect. 2. Section 3 reports the results and discussion, and in Sect. 4 we summarize and conclude.

2 Definitions and computational details

We consider the linear (HCN)m (m = 2–4) complexes, aligned along the z-axis of the Cartesian coordinate system. The HCN bond distances are chosen equal to those reported by Carter et al. [24] for an isolated HCN molecule, i.e., \(R_{\rm HC}\) = 1.06501(8) Å and \(R_{\rm CN}\) = 1.15324(2) Å. The intermolecular hydrogen bond distances are adopted from the recent work by Adrian-Scotto and Vasilescu [19]. The complete set of molecular parameters is reported in Table 1.

Table 1 Intermolecular hydrogen bond distances in the (HCN)m (m = 2–4) complexes

Interaction energies can be evaluated using different methods. In the present investigation, we employ the so-called supermolecular approach that defines the interaction energy of a dimer, \(\Updelta E_{AB}\), as the difference between the energy of the complex, \(E_{AB}\), and the energies of its subunits, \(E_{A}\) and \(E_{B}\):

$$ \Updelta E_{AB} = E_{AB} - E_{A} - E_{B}. $$
(1)

We correct all the results for the BSSE using the counterpoise correction by Boys and Bernardi [25], i.e., both the energy of the complex and the energies of the monomers are calculated in the basis set of the dimer. Definition (1) can be generalized to the case of the induced electric dipole properties by replacing E with \(P = \mu, \alpha_{\alpha\beta}, \beta_{\alpha\beta\gamma}\), etc. (\(\alpha, \beta, \gamma, \ldots = x, y, z\)).

Interaction energies and interaction-induced properties in larger AB···N complexes are calculated here using the site–site counterpoise method, as the respective differences in the energy or electric property of the complex and the energies or electric properties of all N subsystems:

$$ \Updelta P_{AB\cdots N} = P_{AB\cdots N} - \sum_{i=A}^{N} P_{i}. $$
(2)

The energies and the electric properties of the monomers are calculated in the basis set of the complex.
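Equations 1 and 2 and the counterpoise prescription can be summarized in a few lines of code. The sketch below is only an illustration: the function names are hypothetical, the numbers are synthetic and not taken from any calculation in this work, and the BSSE is taken here simply as the difference between the non-corrected and the counterpoise-corrected result.

```python
from typing import Sequence


def interaction_property(p_complex: float, p_subsystems: Sequence[float]) -> float:
    """Eq. 2: property (or energy) of the complex minus the sum over all
    subsystems. For a counterpoise-corrected value, every subsystem entry
    must have been computed in the full basis set of the complex."""
    return p_complex - sum(p_subsystems)


def bsse_estimate(p_complex: float,
                  p_subsystems_own_basis: Sequence[float],
                  p_subsystems_complex_basis: Sequence[float]) -> float:
    """Non-corrected minus counterpoise-corrected interaction property."""
    non_corrected = interaction_property(p_complex, p_subsystems_own_basis)
    corrected = interaction_property(p_complex, p_subsystems_complex_basis)
    return non_corrected - corrected


# Synthetic energies (arbitrary units), chosen only to exercise the functions
e_complex = -2.010
e_own_basis = [-1.000, -1.000]        # monomers in their own basis sets
e_complex_basis = [-1.001, -1.0005]   # monomers in the basis set of the complex

de_nc = interaction_property(e_complex, e_own_basis)
de_cp = interaction_property(e_complex, e_complex_basis)
print(de_nc, de_cp, bsse_estimate(e_complex, e_own_basis, e_complex_basis))
```

Because the monomer energies are lowered when computed in the larger basis set of the complex, the non-corrected interaction energy in this synthetic example is more negative than the corrected one.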

We evaluate the interaction energies, the induced dipole moment and all non-zero components of the induced dipole polarizability and first and second hyperpolarizabilities in the investigated systems within the HF SCF and the correlated (MP2, CCSD and CCSD(T)) approximations. All four available LPol-n (n = ds, fs, dl, fl) basis sets are employed in the study.

The LPol-n bases [10] belong to the family of the polarized sets derived by Sadlej and co-workers over the last decades [4–9]. The Pol set family, including the ZPol and the LPol sets, is designed for calculations of electric properties in molecular systems. These polarized sets exploit a simple physical model of a harmonic oscillator perturbed by an external static electric field [3]. The resemblance of the solutions of the harmonic oscillator Schrödinger equation to the Gaussian-type orbitals commonly used in ab initio and DFT calculations leads to a model for generating the polarization functions that augment a properly chosen source set of basis functions. Recently, the model has been generalized to the case of a dynamic electric field perturbation [26], which resulted in the development of the LPol-n basis sets [10].

For an isolated system (\({\bf F} = {\bf 0}\), with \({\bf F}\) denoting the Cartesian vector of the homogeneous static electric field), each primitive Gaussian-type orbital (GTO) \(\{G_{\mu,l}({\bf r}; {\bf R}_{\mu}({\bf 0}), \alpha_{\mu})\}\) is a function of the electron coordinate vector \({\bf r}\) and is fully defined by its origin \({\bf R}_{\mu}({\bf 0})\), orbital exponent \(\alpha_{\mu}\) and the angular momentum quantum number l. In the presence of an electric field, the GTOs become field dependent,

$$ G_{\mu,l}({{\bf r}}; {{\bf R}}_{\mu}({{\bf 0}}),\alpha_{\mu}) \rightarrow G_{\mu,l}({{\bf r}}; {{\bf R}}_{\mu}({{\bf F}}),\alpha_{\mu}). $$
(3)

Thus, the eigenvector \(u({\bf r};{\bf 0})\), being a linear combination of GTOs, becomes a function of the electric field through the basis set functions and the expansion coefficients,

$$ u({{\bf r}};{{\bf 0}}) \rightarrow u({{\bf r}};{{\bf F}}) = \sum_{\mu} c_{\mu}({{\bf F}}) G_{\mu,l}({{\bf r}}; {{\bf R}}_{\mu}({{\bf F}}),\alpha_{\mu}). $$
(4)

The first-order perturbed function \(u^{(1)}({\bf r};{\bf 0})\), referred to as the first-order polarization function, can be written as [7, 26]:

$$ u^{(1)}({{\bf r}};{{\bf 0}}) \sim \sum_{n=1}^{\infty} b_{n}(f_{n-}^{(1)} + f_{n+}^{(1)}), $$
(5)

with

$$ f_{n \pm}^{(1)} = \sum_{\mu} c_{\mu}({\bf 0}) \alpha_{\mu}^{-n + 1 \mp 1/2} G_{\mu,l \pm 1}({\bf r};{\bf R}_{\mu}({\bf 0}),\alpha_{\mu}), $$
(6)

and is added to the source basis set \(\{G_{\mu,l}({\bf r};{\bf R}_{\mu}({\bf 0}), \alpha_{\mu})\}\). The polarization function contraction coefficients are obtained from the field-independent GTO contraction coefficients through the simple scaling given by Eq. 6. The total size of the resulting polarized set is reduced through (1) an appropriate contraction of the innermost orbitals, (2) generation of the polarization functions only for the valence orbitals and for the outermost primitive GTOs, i.e., those favored by the scaling defined by Eq. 6, and (3) assuming that the majority of the functions corresponding to the \(f_{n-}^{(1)}\) components of (5) are already present in the initial set of GTOs.
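The scaling of Eq. 6 can be illustrated with a short sketch. It only applies the exponent \(-n + 1 \mp 1/2\) to a set of source contraction coefficients; the function name, the exponents and the coefficients are hypothetical, and normalization, the choice of source orbitals and the truncation steps listed above are not handled.

```python
def polarization_coefficients(coeffs, exponents, n=1, sign=+1):
    """Scale field-independent contraction coefficients c_mu(0) by
    alpha_mu**(-n + 1 - sign/2), following the exponent in Eq. 6.

    sign=+1 corresponds to the l+1 component f_{n+}^{(1)},
    sign=-1 to the l-1 component f_{n-}^{(1)}.
    The result is not normalized.
    """
    if sign not in (+1, -1):
        raise ValueError("sign must be +1 or -1")
    power = -n + 1 - 0.5 * sign
    return [c * a ** power for c, a in zip(coeffs, exponents)]


# Hypothetical primitives, ordered from tight to diffuse
exponents = [4.0, 1.0, 0.25]
coeffs = [1.0, 1.0, 1.0]

plus = polarization_coefficients(coeffs, exponents, n=1, sign=+1)
minus = polarization_coefficients(coeffs, exponents, n=1, sign=-1)
print(plus, minus)
```

For n = 1 the l + 1 component is weighted by \(\alpha_{\mu}^{-1/2}\), so the most diffuse primitive receives the largest coefficient, in line with the statement that the outermost primitive GTOs are favored by the scaling.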

In general, polarized sets can be generated from any set of GTOs. However, the quality of the source set of functions determines to a large extent the quality of the resulting polarized basis set, and thus the source set should be carefully chosen. In the case of the LPol-n bases, the source set was chosen to be the van Duijneveldt 13s8p (10s for hydrogen) basis set [27], additionally augmented with one diffuse s- and one diffuse p-type function to a 14s9p set (one diffuse s-type function to an 11s set for hydrogen) to increase the flexibility of the source set. The resulting initial set was contracted after careful atomic tests and augmented with the first-order (ds and dl bases), or the first- and second-order (fs and fl bases) polarization functions.

Because of the size of the investigated systems and of the employed basis sets, the finite field approximation is used in the evaluation of the induced electric properties. Electric fields are applied along the z and the x directions. In the extensive preliminary tests performed in our earlier study [10], electric field strengths of 0.005 and 0.010 au were found to be optimal for the calculation of linear and non-linear electric properties of an isolated HCN molecule. Here, we use the same field strength values. Perpendicularly aligned electric fields are used in the calculation of the mixed tensorial components of the electric hyperpolarizabilities. The coupled cluster LPol-fl calculations of components other than the axial ones are found to be too demanding for the (HCN)4 complex.
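The finite field idea can be sketched for a single axial component as follows. The working formulas actually used in this study are not spelled out in the text, so the sketch assumes one common choice: the energy expansion \(E(F) = E_0 - \mu F - \alpha F^2/2 - \beta F^3/6 - \gamma F^4/24\) and central differences on the five-point grid 0, ±F, ±2F (with F = 0.005 au this gives the two field strengths quoted above). The function `finite_field_properties` and the model energy are hypothetical; mixed components, which require perpendicular fields, are not covered.

```python
def finite_field_properties(energy, field=0.005):
    """Estimate mu, alpha, beta and gamma along one axis from energies at
    F = 0, +/-field and +/-2*field, assuming the Taylor expansion
    E(F) = E0 - mu*F - alpha*F**2/2 - beta*F**3/6 - gamma*F**4/24.

    energy: callable returning the total energy for a given field strength.
    """
    f = field
    e0 = energy(0.0)
    ep1, em1 = energy(f), energy(-f)
    ep2, em2 = energy(2.0 * f), energy(-2.0 * f)

    odd1, odd2 = ep1 - em1, ep2 - em2      # odd-order terms only
    even1, even2 = ep1 + em1, ep2 + em2    # even-order terms only

    mu = -(8.0 * odd1 - odd2) / (12.0 * f)
    alpha = (30.0 * e0 - 16.0 * even1 + even2) / (12.0 * f ** 2)
    beta = (2.0 * odd1 - odd2) / (2.0 * f ** 3)
    gamma = -(even2 - 4.0 * even1 + 6.0 * e0) / f ** 4
    return mu, alpha, beta, gamma


# Synthetic quartic model with known coefficients (not data from this work)
MU, ALPHA, BETA, GAMMA = 1.2, 20.0, -10.0, 2000.0


def model_energy(f):
    return -1.0 - MU * f - ALPHA * f**2 / 2 - BETA * f**3 / 6 - GAMMA * f**4 / 24


mu, alpha, beta, gamma = finite_field_properties(model_energy, field=0.005)
print(mu, alpha, beta, gamma)
```

For the quartic model the formulas recover the input coefficients up to round-off. With real energies, the higher derivatives amplify numerical noise by inverse powers of the field strength, which is one reason the choice of field strength needs to be tested.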

All calculations have been carried out using the MOLCAS 6.5 and 7.4 packages [28, 29]. In the following, we present only the results that are essential for the analysis and discussion. The complete set of results is available in the Supporting Information.

3 Results and discussion

We start the discussion with the basis set dependence of the induced electric properties and the interaction energies in the investigated complexes. The results of the counterpoise-corrected CCSD(T) finite field calculations are presented in Table 2. The interaction-induced electric dipole moment values are stable in all four LPol-n (n = ds, fs, dl, fl) sets, with differences smaller than 0.5%. The MP2/6-311++G(d,p) results of Rivelino et al. [18] for the m = 2, 3, 4 clusters are well below our MP2 values, possibly because of the basis set they use. Our values also considerably improve on those calculated by Adrian-Scotto and Vasilescu [19], with respect to both method and basis set. For the HCN dimer, all our theoretical results agree well with the experimentally determined value (0.303 au [22]), although one has to consider that the ab initio results refer to the studied linear configuration, whereas the experimental values include the effects of the zero-point vibrations. For the trimer, our value is 0.022 au below the experimental value in Ref. [23].

Table 2 Counterpoise-corrected CCSD(T) finite field interaction-induced static electric dipole properties and interaction energies of the (HCN)m (m = 2–4) complexes

Any of the LPol-n (n = ds, fs, dl, fl) sets can also be used in reliable evaluations of the induced polarizability components and the zzz-component of the induced first hyperpolarizability. Changes in the \(\Updelta \alpha_{\alpha\alpha}\) and \(\Updelta \beta_{zzz}\) values calculated with the different LPol-n sets are well below 1% for all investigated systems. The values of \(\Updelta \gamma_{zzzz}\) in the dimer and in the trimer are also stable in all basis sets. In the tetramer, \(\Updelta \gamma_{zzzz}\) changes by up to almost 5% with the increase in the basis set size. Other components of the induced first and second hyperpolarizabilities are subtler, and their reliable estimation is far more challenging. A basis set of at least LPol-fs quality is mandatory for reliable calculations of \(\Updelta \beta_{zxx}\). The estimation of the \(\Updelta \gamma_{xxxx}\) and \(\Updelta \gamma_{xxzz}\) values is more demanding still.

In the case of the interaction energies, the ds and the dl bases give results that differ by about 3% from the fs and fl basis set results. For the HCN dimer, the ds and the dl interaction energies are closer to the experimental value (potential well depth of −4.40 kcal/mol, Ref. [21]) than the fs and fl ones. This conclusion has to be taken with caution, considering the above-mentioned differences between the experimental and the theoretical geometries. The best previously available theoretical result [18], obtained with the CCSD(T) method and the aug-cc-pVTZ basis set, is closer to the fs and fl results than to the ds and dl ones. From Ref. [18], we can estimate CCSD(T)/aug-cc-pVTZ interaction energies for the m = 3 and m = 4 complexes as −10.11 and −15.97 kcal/mol, respectively. We do this by correcting the MP2/6-311++G(d,p) results with two differences obtained for the dimer: the CCSD(T) − MP2 difference in the 6-311++G(d,p) basis set and the aug-cc-pVTZ − 6-311++G(d,p) difference at the CCSD(T) level. The two differences are added, and the result is scaled by a factor of 2 in the case of the trimer and of 3 for the tetramer.
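One reading of this additivity estimate can be written out explicitly. The symbols \(\delta_{\rm method}\) and \(\delta_{\rm basis}\) are shorthand introduced here for the two dimer differences, and the factor \(m - 1\) (2 for the trimer, 3 for the tetramer) is interpreted as scaling their sum; this is a sketch of the procedure as described in the text, not a formula quoted from Ref. [18]:

$$ \Updelta E^{\rm est}_{m} \approx \Updelta E^{\rm MP2/6\hbox{-}311++G(d,p)}_{m} + (m-1)\,(\delta_{\rm method} + \delta_{\rm basis}), $$

with

$$ \delta_{\rm method} = \Updelta E^{\rm CCSD(T)/6\hbox{-}311++G(d,p)}_{2} - \Updelta E^{\rm MP2/6\hbox{-}311++G(d,p)}_{2}, \qquad \delta_{\rm basis} = \Updelta E^{\rm CCSD(T)/aug\hbox{-}cc\hbox{-}pVTZ}_{2} - \Updelta E^{\rm CCSD(T)/6\hbox{-}311++G(d,p)}_{2}. $$

In this reading, the sum \(\delta_{\rm method} + \delta_{\rm basis}\) reduces to the difference between the CCSD(T)/aug-cc-pVTZ and the MP2/6-311++G(d,p) dimer interaction energies.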

We now turn our attention to the magnitude of the interaction-induced effects. In this analysis, we refer to the LPol-fl results and, wherever those are not available, to the LPol-fs values. The induced dipole moment in the dimer is equal to 0.293 au. Addition of the next hydrogen cyanide molecule increases this value by 0.393 au, i.e., by over 130%. This agrees well with the experimentally determined enhancement (0.405 au [22, 23]), but, as pointed out above, a quantitative comparison to experiment is not straightforward. The induced dipole moment in the tetramer is equal to 1.121 au, about 3.8 times larger than that in the dimer. An analogous enhancement is observed for the zz-component of the induced electric dipole polarizability: the induced effect changes from 4.5 au in the dimer, through 10.5 au in the trimer, to 17.2 au in the tetramer. In the case of the xx-component of the induced polarizability, the addition of the third HCN molecule to the dimer causes an increase of the induced effect on the order of 120%. The xx-component of the induced polarizability in the tetramer is almost 3.5 times larger than that in the dimer.
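The ratios quoted in this paragraph follow directly from the numbers given in the text, as the short check below shows. The only value not taken from the text is the conversion factor between debye and atomic units of dipole moment (1 au ≈ 2.541746 D); the variable names are illustrative.

```python
DEBYE_PER_AU = 2.541746  # 1 atomic unit of dipole moment (e*a0) in debye

# Interaction-induced dipole moments quoted in the text (au)
dmu_dimer = 0.293
dmu_trimer = dmu_dimer + 0.393
dmu_tetramer = 1.121

trimer_increase_percent = 100.0 * (dmu_trimer - dmu_dimer) / dmu_dimer
tetramer_to_dimer = dmu_tetramer / dmu_dimer

# Experimental interaction dipoles quoted in the Introduction (debye):
# 0.77 D for the dimer [22] and 1.8 D for the trimer [23]
exp_enhancement_au = (1.8 - 0.77) / DEBYE_PER_AU

# zz-component of the induced polarizability quoted in the text (au)
dalpha_zz = {2: 4.5, 3: 10.5, 4: 17.2}
alpha_zz_ratio = {m: value / dalpha_zz[2] for m, value in dalpha_zz.items()}

print(round(trimer_increase_percent, 1), round(tetramer_to_dimer, 2))
print(round(exp_enhancement_au, 3))
print({m: round(r, 2) for m, r in alpha_zz_ratio.items()})
```

The check gives an increase of about 134% on going from the dimer to the trimer, a tetramer-to-dimer ratio of about 3.8, and an experimental enhancement of about 0.405 au. The zz-polarizability ratios of about 2.3 and 3.8 show the analogous enhancement mentioned above.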

The observed enhancement of the induced effects in the trimer and in the tetramer is even more pronounced in the case of \(\Updelta \beta_{zzz}\) and \(\Updelta \gamma_{zzzz}\). Addition of the third HCN molecule to the dimer increases the interaction-induced effects about 2.5 times. The induced effects in the tetramer are about 4.3 and 4.5 times larger than in the dimer for \(\Updelta \beta_{zzz}\) and \(\Updelta \gamma_{zzzz}\), respectively. \(\Updelta \beta_{zxx}\) in the trimer (tetramer) is approximately 1.8 (2.6) times larger than in the dimer.

The interaction energy grows with an increasing number of HCN molecules similarly to \(\Updelta \alpha_{xx}\), namely the interaction energy in the trimer is approximately 2.2 times larger than in the dimer, and in the tetramer approximately 3.5 times larger than in (HCN)2.

Table 3 reports the counterpoise-corrected MP2/LPol-fl and CCSD(T)/LPol-fl induced electric properties and interaction energies in the investigated complexes. The MP2 approximation systematically underestimates the induced electric properties and overestimates the interaction energy in the studied systems. The differences between the CCSD(T) and the MP2 results are on the order of 2.5% for \(\Updelta \mu\), 4% for \(\Updelta \alpha_{zz}\) and \(\Updelta E\), 3% for \(\Updelta \beta_{zzz}\) and over 5% for \(\Updelta \gamma_{zzzz}\). Considering this, we conclude that the CCSD(T) method is necessary to obtain accurate interaction properties and energies. The agreement between the MP2 and the CCSD(T) results is much better in the case of \(\Updelta \alpha_{xx}\).

Table 3 Counterpoise-corrected MP2/LPol-fl and CCSD(T)/LPol-fl finite field interaction-induced static electric dipole properties and interaction energies of the (HCN)m (m = 2–4) complexes

In Table 4, we report the counterpoise-corrected and non-corrected CCSD(T)/LPol-fl results. The counterpoise correction of the results proves to be mandatory in the case of the interaction energies, where the non-corrected values are overestimated by over 20%. Such a large error is observed for all investigated basis sets and at all correlated levels of theory (see Table IX in the Supporting Information). For the CCSD(T)/LPol-fl interaction-induced electric properties, the BSSE proves to be well below 1%, and thus it can be safely neglected. A similar trend in the size of the BSSE is observed for the other LPol-n (n = ds, fs, dl) sets and all employed levels of theory (see the Supporting Information).

Table 4 Counterpoise-corrected (C) and non-corrected (NC) CCSD(T)/LPol-fl finite field interaction-induced static electric dipole properties and interaction energies of the (HCN)m (m = 2–4) complexes

4 Summary

We investigate the interaction energies and induced (linear and non-linear) electric dipole properties in the linear (HCN)m (m = 2–4) complexes. Calculations are performed at the HF SCF, MP2, CCSD and CCSD(T) levels of approximation using the LPol-n (n = ds, fs, dl, fl) sets. The electric properties are evaluated using the finite field method. The MP2 method is shown to systematically underestimate the induced electric properties and overestimate the interaction energies. The MP2 results differ from the CCSD(T) values by as much as 2.5% for the induced dipole moment, 4% for \(\Updelta \alpha_{zz}\) and \(\Updelta E\), 3% for \(\Updelta \beta_{zzz}\) and over 5% for \(\Updelta \gamma_{zzzz}\). Thus, to obtain accurate interaction energies and/or electric properties, one needs to resort to the CCSD(T) level of approximation. The counterpoise correction proves to be mandatory in the evaluation of the interaction energies of the investigated systems, where the BSSE is on the order of 20%. The induced electric properties are far less sensitive to the BSSE, with errors well below 1%. This implies a considerable saving in computing time when evaluating interaction properties in larger complexes. A reliable estimation of the induced dipole moments, polarizabilities and first hyperpolarizabilities in linear hydrogen cyanide complexes can be obtained using the smallest of the LPol-n sets, the LPol-ds basis set, with practically no deterioration of the results. The interaction-induced second hyperpolarizabilities are more demanding, and basis sets of LPol-fs or LPol-fl quality are necessary for an accurate evaluation.