EMPIRE: a highly parallel semiempirical molecular orbital program: 2: periodic boundary conditions

ᅟ Graphical Abstract ᅟ Electronic supplementary material The online version of this article (doi:10.1007/s00894-015-2692-3) contains supplementary material, which is available to authorized users.


Introduction
The increasing role of quantum chemical calculations in drug and materials design has led to a demand for methods that can describe the electronic structures of large and complex systems. Semiempirical methods based on the neglect of diatomic differential overlap (NDDO) approximation (e.g., the MNDO [1,2], MNDO/d [3], AM1 [4], AM1* [5], and PMx [6][7][8] methods) are important representatives of such approaches. Many of these methods have been implemented in the massively parallel program EMPIRE [9], which makes the full quantum-mechanical treatment of systems containing 100,000 atoms or more possible.
Periodic boundary conditions (PBC) enable quantum chemical programs to treat condensed-phase systems, such as proteins in a periodic water box or solids. This allows molecular materials to be studied in their Bnative^environment, instead of comparing experimental bulk properties with gas-phase monomer calculations. For semiempirical methods, the most practical way of implementing PBC is the cycliccluster approach [10][11][12] in which the system is approximated by a supercell and by imposing Born-von Karman boundary conditions [13]. Using a large unit cell allows the calculation to be performed entirely in real space. This is easily affordable because of the generally low computational cost of NDDO calculations. The main advantage of this technique is that program features like the calculation of local properties [14] or excited states are directly transferable from nonperiodic calculations [15]. We have, for example, used periodic EMPIRE calculations to model amorphous carbon [16].
EMPIRE, which was especially designed for calculations on systems with very many atoms, is also suitable for use on systems with very large unit cells (e.g., disordered and amorphous systems). EMPIRE can, for example, be used in combination with a classical molecular dynamics (MD) code to perform electronic structure calculations on snapshots from an MD run on a periodic system. In the first section of this paper, we discuss the implementation of periodic boundary conditions in EMPIRE. In the second, the program performance is discussed briefly. Finally, some exemplary applications of large-scale periodic NDDO calculations are shown.

Implementation
Periodic calculations in EMPIRE are performed entirely in real space. Therefore, no major changes to the NDDO SCF algorithm were required. Only small adjustments are necessary in the treatment of two-electron two-center integrals: the exchange energy and electrostatic interactions. These adjustments will be discussed below. For more background information, we refer the reader to [10,12].

Two-electron two-center integrals
The values of the two-electron two-center integrals γ AB used in NDDO calculations (and the associated potential) quickly decrease with the distance between the centers but remain nonzero. Since these small values add up to unphysical, infinite potentials in a periodic system, they must be corrected [12]. This is achieved by introducing a Gaussian damping function that sets in at a cutoff value c cut (the default is 10.0 bohrs). The functional form for these integrals is (in atomic units) where G A and G B are parameterized constants for elements A and B, respectively, and r is the distance between the two centers.

Exchange interaction energy
The next adjustment is required with respect to the twoelectron two-center exchange integrals h μν , which appear in the Fock matrix. These terms depend on the density matrix elements P μν . In a periodic calculation, the exchange interactions for an orbital centered on a given atom are only evaluated within the Wigner-Seitz cell surrounding it. The net result is the neglect of very weak exchange interactions with distant electrons, which causes no loss in accuracy [10].

Electrostatic interactions
MNDO-like NDDO methods describe electrostatic electronelectron and electron-core interactions using multipole-multipole interactions. In a periodic system, small interactions with an infinite number of distant charges lead to unphysical results. This can be alleviated by introducing a simple, oneparameter screening function. Simply put, distant charges are relocated to an effective distance r eff , which is a function of the actual distance r. The space around a charge is divided into three regions, delimited by a parameter α: at close distances (r<α), the actual and effective distances are equal (r eff = r). At large distances (r>2α), all charges are moved to a constant radius of 1.5α. In this manner, their effects cancel each other out due to symmetry [12]. In the intermediate region, the distance is scaled so as to satisfy the conditions and This scaling function [12] is defined as MNDO-like methods treat charges as distributed multipoles, i.e., point charges at a defined distance to a center. To keep this screening scheme completely consistent, the positions of all individual point charges that make up distant multipoles would need to be scaled. This is undesirable because it would require calculating the distance between all atoms and all distributed multipole point charges. To avoid this, we introduce a second scaling function for the distance between the point charges of a multipole and the atom on which they are centered. At distances r<α, the multipoles are unaffected. At large distances r>2α, all multipoles are reduced to point charges. In the intermediate region α<r <2α, the multipoles are scaled by a factor λ(r), with the boundary conditions The corresponding function is the derivative of (4): The effect of scaling the multipole size can be evaluated by considering three different scenarios for an interaction between a point-charge and a dipole. The rigorous but unpractical solution is to scale the positions of the constituent point charges of the dipole individually, whereas the simplest approach would be to scale only the center of the dipole. Finally, we can scale the center of the dipole according to (4) and the distances of the distributed point charges to their center by (7). Figure 1 shows the positions of the distributed monopoles for all three cases (α=15.0 Å). Clearly, the scaling function λ is a practical way to describe the exact scaling of the multipole distance. This can also be shown by considering the Coulomb energy for the interaction between point charge and dipole. The dependence of the absolute error (i.e., the difference between the unscaled/scaled and exact cases) for this energy on the distance is also shown in Fig. 1. Below the cutoff value, all models are by definition equivalent. At the cutoff value, the scaled and unscaled errors are identical. At increasing distances, however, the error decreases for the scaled case and increases for the unscaled case.

Performance Setting up periodic calculations
Periodic EMPIRE calculations require little more input than nonperiodic ones. Apart from the Cartesian coordinates of all atoms, a unit-cell vector is required for each periodic direction. Calculations can be periodic in one, two, or three dimensions. Since no k-space sampling is performed, the unit cell should be sufficiently large. The MOPAC manual suggests that 7-8 Å per repeat unit vector should be sufficient for most compounds, and larger unit cells should be used for highly conjugated π systems and small band-gap materials [17]. It is good practice to check the convergence of the calculated results with the size of the unit cell. Figure 2 shows the convergence of the calculated heat of formation with the unit-cell volume for diamond, ZnO, NaCl, and the adamantane molecular crystal. The convergence of the unit-cell size may differ for other properties, as Bredow et al. showed for excitation energies, where significantly larger cells were required than for the ground-state energy [15].
The electrostatic screening parameter can be modified via the keyword ScreeningR, which sets the value of 2α in Å. The default is 30.0 Å. This conservative cutoff corresponds to the MOPAC default. Lower cutoffs result in lower computational cost, but whether the heat of formation is affected by the change should be checked. The energy convergences and computational costs of different values of α are shown for diamond and ZnO in the BElectronic supplementary material( ESM, Figs. S1 and S2).
Single-node open MP scaling Table 1 shows timings for AM1-SCF calculations of differently sized diamond and ZnO unit cells performed with the single-node OMP version of EMPIRE. The corresponding speedup for different numbers of cores is plotted in Fig. 3. The scaling is quite efficient; the speedup factor is >7 using eight cores for all systems investigated. The largest system considered here is the C 512 unit cell, for which an SCF  calculation takes less than 1 min on eight cores. This shows that on a modern desktop computer, periodic calculations with EMPIRE are absolutely affordable (see Table 2 and Fig. 4).

Multi-node hybrid OMP/MPI scaling
The scaling of the hybrid OMP/MPI multi-node version of EMPIRE was tested on the LiMa cluster at the Regionales Rechenzentrum Erlangen. Here, we used differently sized diamond unit cells from C 1,728 to C 13,824 . Please note that very large unit cells also require large amounts of memory, especially because the integrals are stored, since their calculation is relatively expensive in periodic calculations. Therefore, it is not possible to use the same reference number of nodes when determining the scaling for these systems. The speedup is always relative to the lowest number of nodes feasible for a given system. Optimizing the SCF procedure for periodic calculations may improve the performance of EMPIRE on fewer nodes. As it is, the calculations scale very impressively up to twice the minimum number of nodes. Further increasing the number of nodes leads to a plateau.

Application
The application of NDDO methods to crystalline materials has been thoroughly tested and evaluated by Stewart, and will therefore not be discussed here in any detail [18]. Instead, we would like to focus on two aspects unique to EMPIRE: firstly, the calculation of local properties; secondly, the fact that even unit cells with thousands of atoms can be treated easily.

Local properties
A local property is any property that can be derived from the wavefunction of a structure and mapped onto a real-space grid, such as the electron density and the molecular electrostatic potential (MEP). These can be calculated with most electronic structure codes. EMPIRE (in combination with an auxiliary program) gives access to several additional local properties derived from molecular orbitals and their energies. These are the local electron affinity (EA L ), ionization energy (IE L ), electronegativity, and hardness, which have been used for biochemical QSPR studies and to predict the electrontransport properties of nanostructures [19][20][21][22][23][24][25][26][27]. Figure 5 shows the molecular electrostatic potentials of a pristine and a defective ZnO 1010Þ ð surface. The nonpolar 1010Þ ð surface consists of rows of ZnO dimers that are separated by trenches [28]. The most abundant atomic defects on this surface are ZnO dimer vacancies [29]. The entire geometry was re-optimized with the MNDO/d Hamiltonian for both the pristine and the defective surface. The removal of one ZnO dimer clearly affects the electrostatic potential around the  Each node was equipped with two six-core Intel ® Xeon ® 5650 BWestmere^chips; the nodes were connected by an Infiniband interconnect fabric with 40 Gbit/s bandwith per link and direction. We used two MPI tasks per node and six OMP threads for each. No hyperthreading was used defect, making it necessary to choose a large unit cell to avoid interactions of the defect with its periodic image. This calculation required around 1 min per optimization step and converged in 32 steps. The cell contains 766 atoms and 3,064 electrons. Note also that the MEP does not depend on simplifications such as point-charge models. The multipole formalism used in MNDO-like techniques gives a very good representation of the MEP calculated at higher levels of theory, including anisotropic distributions around heavy atoms [30]. A recent application of periodic local property maps lies in the study of charge transport in organic materials [25][26][27]31]. In the condensed phase, the local ionization energy (IE L ) and electron affinity (EA L ) can be interpreted as the local valenceband maximum and conduction-band minimum, respectively. They can therefore be used to visualize the anisotropic electronic properties of a molecular crystal. More recently, the local properties have been used as external potentials to simulate charge transport (see [32]; Bauer T et al., A multi-agent quantum Monte Carlo model for charge transport: application to organic field-effect transistors, submitted). Figure 6 shows the local ionization energy (IE L ) of a rubrene crystal projected onto volume slices that cut through the unit cell along its main axes.
Low IE L values (shown in blue) correspond to electrondonating/hole-conducting pathways, whereas high IE L values (shown in red) represent energy barriers. In Fig. 6, the IE L maps look vastly different depending on the orientation of the volume slice. This is in line with experimental reports, which show that the field-effect mobilities in rubrene single crystals depend strongly on the orientation of the contacts [33,34].

Large unit cells
As an example of a large system, we chose the solvated lipid bilayer membrane shown in Fig. 7. Specifically, the model consists of 128 1,2-dilauroyl-sn-glycero-3-phosphocholine (DLPC) and 3,840 water molecules equilibrated for 400 ns in a classical molecular dynamics simulation [35]. The unit cell contains 25,088 atoms and spans 62.502 × 65.506 × 58.441 Å 3 . An AM1-SCF calculation was performed on 384 cores of the LiMa cluster (64 MPI tasks on 32 nodes with 2×6 cores each). The SCF converged in 31 cycles and took a little over 3 h 7 min.
Note that periodic calculations of this size push doubleprecision (64-bit) arithmetic to its limit, since many small values are summed to a very large result during the energy summation. To avoid numerical inaccuracies for large systems, this step is performed in quadruple precision (128-bit), and special care is taken in the ordering of the summands. The resulting HDF5 binary wavefunction file has a size of 21 GB and can be used to calculate local property maps. The molecular electrostatic potential across the membrane is shown in Fig. 7 (right). This clearly visualizes the polar water layer and head groups and the nonpolar lipid bilayer. Such calculations could, for instance, be used to predict the permeability of membranes to different chemicals.
Every EMPIRE calculation also includes a Coulson population analysis. Figure 8 shows a plot of the Coulson charges of all oxygen atoms as a function of their vertical position. In this plot, five charge groups are discernable, corresponding to the four chemically distinct oxygen atoms in DLPC and the oxygen atom in water. This presents an interesting perspective in the development of force fields for condensed-phase applications, since the charges can be derived directly for the solid or liquid of interest.

Conclusions
We have implemented periodic boundary conditions in the massively parallel semiempirical molecular orbital theory code EMPIRE. The standard SCF procedure of EMPIRE reliably converges the wavefunctions of a broad range of periodic systems, including covalent, ionic, and molecular crystals and surfaces as well as disordered biological systems such as a lipid bilayer. Like the nonperiodic version of EMPIRE, the program is parallelized in the single-node version via open MP, and in the multi-node version via a hybrid open MP/MPI approach.
The single-node version was shown to perform well for calculations on unit cells containing between 64 and 512 atoms, and to scale very efficiently on up to eight cores. The multi-node version allows systems with tens of thousands of atoms to be treated; the largest system described here consisted of 25,088 atoms. The program scaling is similar to that observed for nonperiodic calculations with EMPIRE [9].