## Abstract

ᅟ

## Introduction

The increasing role of quantum chemical calculations in drug and materials design has led to a demand for methods that can describe the electronic structures of large and complex systems. Semiempirical methods based on the neglect of diatomic differential overlap (NDDO) approximation (e.g., the MNDO [1, 2], MNDO/d [3], AM1 [4], AM1* [5], and PMx [6–8] methods) are important representatives of such approaches. Many of these methods have been implemented in the massively parallel program EMPIRE [9], which makes the full quantum-mechanical treatment of systems containing 100,000 atoms or more possible.

Periodic boundary conditions (PBC) enable quantum chemical programs to treat condensed-phase systems, such as proteins in a periodic water box or solids. This allows molecular materials to be studied in their “native” environment, instead of comparing experimental bulk properties with gas-phase monomer calculations. For semiempirical methods, the most practical way of implementing PBC is the cyclic-cluster approach [10–12] in which the system is approximated by a supercell and by imposing Born–von Karman boundary conditions [13]. Using a large unit cell allows the calculation to be performed entirely in real space. This is easily affordable because of the generally low computational cost of NDDO calculations. The main advantage of this technique is that program features like the calculation of local properties [14] or excited states are directly transferable from nonperiodic calculations [15]. We have, for example, used periodic EMPIRE calculations to model amorphous carbon [16].

EMPIRE, which was especially designed for calculations on systems with very many atoms, is also suitable for use on systems with very large unit cells (e.g., disordered and amorphous systems). EMPIRE can, for example, be used in combination with a classical molecular dynamics (MD) code to perform electronic structure calculations on snapshots from an MD run on a periodic system. In the first section of this paper, we discuss the implementation of periodic boundary conditions in EMPIRE. In the second, the program performance is discussed briefly. Finally, some exemplary applications of large-scale periodic NDDO calculations are shown.

## Implementation

Periodic calculations in EMPIRE are performed entirely in real space. Therefore, no major changes to the NDDO SCF algorithm were required. Only small adjustments are necessary in the treatment of two-electron two-center integrals: the exchange energy and electrostatic interactions. These adjustments will be discussed below. For more background information, we refer the reader to [10, 12].

### Two-electron two-center integrals

The values of the two-electron two-center integrals* γ*
_{
AB
} used in NDDO calculations (and the associated potential) quickly decrease with the distance between the centers but remain nonzero. Since these small values add up to unphysical, infinite potentials in a periodic system, they must be corrected [12]. This is achieved by introducing a Gaussian damping function that sets in at a cutoff value *c*
_{cut} (the default is 10.0 bohrs). The functional form for these integrals is (in atomic units)

where *G*
_{
A
} and *G*
_{
B
} are parameterized constants for elements *A* and *B*, respectively, and *r* is the distance between the two centers.

### Exchange interaction energy

The next adjustment is required with respect to the two-electron two-center exchange integrals *h*
_{
μν
}, which appear in the Fock matrix. These terms depend on the density matrix elements *P*
_{
μν
}. In a periodic calculation, the exchange interactions for an orbital centered on a given atom are only evaluated within the Wigner–Seitz cell surrounding it. The net result is the neglect of very weak exchange interactions with distant electrons, which causes no loss in accuracy [10].

### Electrostatic interactions

MNDO-like NDDO methods describe electrostatic electron–electron and electron–core interactions using multipole–multipole interactions. In a periodic system, small interactions with an infinite number of distant charges lead to unphysical results. This can be alleviated by introducing a simple, one-parameter screening function.

Simply put, distant charges are relocated to an effective distance *r*
_{eff}, which is a function of the actual distance *r*. The space around a charge is divided into three regions, delimited by a parameter* α*: at close distances (*r* < *α*), the actual and effective distances are equal (*r*
_{eff} = *r*). At large distances (*r* > 2*α*), all charges are moved to a constant radius of 1.5*α*. In this manner, their effects cancel each other out due to symmetry [12]. In the intermediate region, the distance is scaled so as to satisfy the conditions

and

This scaling function [12] is defined as

MNDO-like methods treat charges as distributed multipoles, i.e., point charges at a defined distance to a center. To keep this screening scheme completely consistent, the positions of all individual point charges that make up distant multipoles would need to be scaled. This is undesirable because it would require calculating the distance between all atoms and all distributed multipole point charges. To avoid this, we introduce a second scaling function for the distance between the point charges of a multipole and the atom on which they are centered. At distances* r* < *α*, the multipoles are unaffected. At large distances* r* > 2*α*, all multipoles are reduced to point charges. In the intermediate region* α* <* r * < 2*α*, the multipoles are scaled by a factor* λ*(*r*), with the boundary conditions

and

The corresponding function is the derivative of (4):

The effect of scaling the multipole size can be evaluated by considering three different scenarios for an interaction between a point-charge and a dipole. The rigorous but unpractical solution is to scale the positions of the constituent point charges of the dipole individually, whereas the simplest approach would be to scale only the center of the dipole. Finally, we can scale the center of the dipole according to (4) and the distances of the distributed point charges to their center by (7).

Figure 1 shows the positions of the distributed monopoles for all three cases (*α* = 15.0 Å). Clearly, the scaling function* λ* is a practical way to describe the exact scaling of the multipole distance. This can also be shown by considering the Coulomb energy for the interaction between point charge and dipole. The dependence of the absolute error (i.e., the difference between the unscaled/scaled and exact cases) for this energy on the distance is also shown in Fig. 1. Below the cutoff value, all models are by definition equivalent. At the cutoff value, the scaled and unscaled errors are identical. At increasing distances, however, the error decreases for the scaled case and increases for the unscaled case.

## Performance

### Setting up periodic calculations

Periodic EMPIRE calculations require little more input than nonperiodic ones. Apart from the Cartesian coordinates of all atoms, a unit-cell vector is required for each periodic direction. Calculations can be periodic in one, two, or three dimensions. Since no k-space sampling is performed, the unit cell should be sufficiently large. The MOPAC manual suggests that 7–8 Å per repeat unit vector should be sufficient for most compounds, and larger unit cells should be used for highly conjugated π systems and small band-gap materials [17]. It is good practice to check the convergence of the calculated results with the size of the unit cell. Figure 2 shows the convergence of the calculated heat of formation with the unit-cell volume for diamond, ZnO, NaCl, and the adamantane molecular crystal. The convergence of the unit-cell size may differ for other properties, as Bredow et al. showed for excitation energies, where significantly larger cells were required than for the ground-state energy [15].

The electrostatic screening parameter can be modified via the keyword ScreeningR, which sets the value of 2*α* in Å. The default is 30.0 Å. This conservative cutoff corresponds to the MOPAC default. Lower cutoffs result in lower computational cost, but whether the heat of formation is affected by the change should be checked. The energy convergences and computational costs of different values of* α* are shown for diamond and ZnO in the “Electronic supplementary material” (ESM, Figs. S1 and S2).

### Single-node open MP scaling

Table 1 shows timings for AM1-SCF calculations of differently sized diamond and ZnO unit cells performed with the single-node OMP version of EMPIRE. The corresponding speedup for different numbers of cores is plotted in Fig. 3. The scaling is quite efficient; the speedup factor is >7 using eight cores for all systems investigated. The largest system considered here is the C_{512} unit cell, for which an SCF calculation takes less than 1 min on eight cores. This shows that on a modern desktop computer, periodic calculations with EMPIRE are absolutely affordable (see Table 2 and Fig. 4).

### Multi-node hybrid OMP/MPI scaling

The scaling of the hybrid OMP/MPI multi-node version of EMPIRE was tested on the LiMa cluster at the Regionales Rechenzentrum Erlangen. Here, we used differently sized diamond unit cells from C_{1,728} to C_{13,824}. Please note that very large unit cells also require large amounts of memory, especially because the integrals are stored, since their calculation is relatively expensive in periodic calculations. Therefore, it is not possible to use the same reference number of nodes when determining the scaling for these systems. The speedup is always relative to the lowest number of nodes feasible for a given system. Optimizing the SCF procedure for periodic calculations may improve the performance of EMPIRE on fewer nodes. As it is, the calculations scale very impressively up to twice the minimum number of nodes. Further increasing the number of nodes leads to a plateau.

## Application

The application of NDDO methods to crystalline materials has been thoroughly tested and evaluated by Stewart, and will therefore not be discussed here in any detail [18]. Instead, we would like to focus on two aspects unique to EMPIRE: firstly, the calculation of local properties; secondly, the fact that even unit cells with thousands of atoms can be treated easily.

### Local properties

A local property is any property that can be derived from the wavefunction of a structure and mapped onto a real-space grid, such as the electron density and the molecular electrostatic potential (MEP). These can be calculated with most electronic structure codes. EMPIRE (in combination with an auxiliary program) gives access to several additional local properties derived from molecular orbitals and their energies. These are the local electron affinity (EA_{L}), ionization energy (IE_{L}), electronegativity, and hardness, which have been used for biochemical QSPR studies and to predict the electron-transport properties of nanostructures [19–27].

Figure 5 shows the molecular electrostatic potentials of a pristine and a defective ZnO\( \left(10\overline{1}0\right) \) surface. The nonpolar \( \left(10\overline{1}0\right) \) surface consists of rows of ZnO dimers that are separated by trenches [28]. The most abundant atomic defects on this surface are ZnO dimer vacancies [29]. The entire geometry was re-optimized with the MNDO/d Hamiltonian for both the pristine and the defective surface. The removal of one ZnO dimer clearly affects the electrostatic potential around the defect, making it necessary to choose a large unit cell to avoid interactions of the defect with its periodic image. This calculation required around 1 min per optimization step and converged in 32 steps. The cell contains 766 atoms and 3,064 electrons. Note also that the MEP does not depend on simplifications such as point-charge models. The multipole formalism used in MNDO-like techniques gives a very good representation of the MEP calculated at higher levels of theory, including anisotropic distributions around heavy atoms [30].

A recent application of periodic local property maps lies in the study of charge transport in organic materials [25–27, 31]. In the condensed phase, the local ionization energy (IE_{L}) and electron affinity (EA_{L}) can be interpreted as the local valence-band maximum and conduction-band minimum, respectively. They can therefore be used to visualize the anisotropic electronic properties of a molecular crystal. More recently, the local properties have been used as external potentials to simulate charge transport (see [32]; Bauer T et al.,* A multi-agent quantum Monte Carlo model for charge transport: application to organic field-effect transistors*, submitted). Figure 6 shows the local ionization energy (IE_{L}) of a rubrene crystal projected onto volume slices that cut through the unit cell along its main axes.

Low IE_{L} values (shown in blue) correspond to electron-donating/hole-conducting pathways, whereas high IE_{L} values (shown in red) represent energy barriers. In Fig. 6, the IE_{L} maps look vastly different depending on the orientation of the volume slice. This is in line with experimental reports, which show that the field-effect mobilities in rubrene single crystals depend strongly on the orientation of the contacts [33, 34].

### Large unit cells

As an example of a large system, we chose the solvated lipid bilayer membrane shown in Fig. 7. Specifically, the model consists of 128 1,2-dilauroyl-*sn*-glycero-3-phosphocholine (DLPC) and 3,840 water molecules equilibrated for 400 ns in a classical molecular dynamics simulation [35]. The unit cell contains 25,088 atoms and spans 62.502 × 65.506 × 58.441 Å^{3}. An AM1-SCF calculation was performed on 384 cores of the LiMa cluster (64 MPI tasks on 32 nodes with 2 × 6 cores each). The SCF converged in 31 cycles and took a little over 3 h 7 min.

Note that periodic calculations of this size push double-precision (64-bit) arithmetic to its limit, since many small values are summed to a very large result during the energy summation. To avoid numerical inaccuracies for large systems, this step is performed in quadruple precision (128-bit), and special care is taken in the ordering of the summands.

The resulting HDF5 binary wavefunction file has a size of 21 GB and can be used to calculate local property maps. The molecular electrostatic potential across the membrane is shown in Fig. 7 (right). This clearly visualizes the polar water layer and head groups and the nonpolar lipid bilayer. Such calculations could, for instance, be used to predict the permeability of membranes to different chemicals.

Every EMPIRE calculation also includes a Coulson population analysis. Figure 8 shows a plot of the Coulson charges of all oxygen atoms as a function of their vertical position. In this plot, five charge groups are discernable, corresponding to the four chemically distinct oxygen atoms in DLPC and the oxygen atom in water. This presents an interesting perspective in the development of force fields for condensed-phase applications, since the charges can be derived directly for the solid or liquid of interest.

## Conclusions

We have implemented periodic boundary conditions in the massively parallel semiempirical molecular orbital theory code EMPIRE. The standard SCF procedure of EMPIRE reliably converges the wavefunctions of a broad range of periodic systems, including covalent, ionic, and molecular crystals and surfaces as well as disordered biological systems such as a lipid bilayer. Like the nonperiodic version of EMPIRE, the program is parallelized in the single-node version via open MP, and in the multi-node version via a hybrid open MP/MPI approach.

The single-node version was shown to perform well for calculations on unit cells containing between 64 and 512 atoms, and to scale very efficiently on up to eight cores. The multi-node version allows systems with tens of thousands of atoms to be treated; the largest system described here consisted of 25,088 atoms. The program scaling is similar to that observed for nonperiodic calculations with EMPIRE [9].

## References

- 1.
Dewar MJS, Thiel W (1977) Ground states of molecules, 38. The MNDO method. Approximations and parameters. J Am Chem Soc 99:4899. doi:10.1021/ja00457a004

- 2.
Dewar MJS, Thiel W (1977) Ground states of molecules, 39. MNDO results for molecules containing hydrogen, carbon, nitrogen, and oxygen. J Am Chem Soc 99:4907. doi:10.1021/ja00457a005

- 3.
Thiel W, Voityuk A (1996) Extension of MNDO to

*d*-orbitals—parameters and results for the second-row elements and for the zinc group. J Phys Chem 100:616. doi:10.1021/jp952148o - 4.
Dewar MJS, Zoebisch EG, Healy EF, Stewart JJP (1985) AM1: a new general purpose quantum mechanical model. J Am Chem Soc 107:3902. doi:10.1021/ja00299a024

- 5.
Winget P, Horn AC, Selçuki C, Martin B, Clark T (2003) AM1* parameters for phosphorus, sulfur and chlorine. J Mol Model 9:408. doi:10.1007/s00894-003-0156-7

- 6.
Stewart JJP (1989) Optimization of parameters for semi-empirical methods I: method. J Comput Chem 10:209–220. doi:10.1002/jcc.540100208

- 7.
Stewart JJP (2007) Optimization of parameters for semiempirical methods V: modification of NDDO approximations and application to 70 elements. J Mol Model 13:1173. doi:10.1007/s00894-007-0233-4

- 8.
Stewart JJP (2013) Optimization of parameters for semiempirical methods VI: more modifications to the NDDO approximations and re-optimization of parameters. J Mol Model 19:1–32. doi:10.1007/s00894-012-1667-x

- 9.
Hennemann M, Clark T (2014) EMPIRE: a highly parallel semiempirical molecular orbital program: 1: self-consistent field calculations. J Mol Model 20:2331. doi:10.1007/s00894-014-2331-4

- 10.
Perkins PG, Stewart JJP (1980) Cluster model for solids. J Chem Soc Faraday Trans 2 76:520. doi: 10.1039/F29807600520

- 11.
Bredow T, Geudtner G, Jug K (2001) Development of the cyclic cluster approach for ionic systems. J Comput Chem 22:89–101. doi:10.1002/1096-987X(20010115)22:1<89::AID-JCC9>3.0.CO;2-7

- 12.
Stewart JJP (2000) A practical method for modeling solids using semiempirical methods. J Mol Struct 556:59–67. doi:10.1016/S0022-2860(00)00651-7

- 13.
Born M, Kármán T (1912) Über Schwingungen in Raumgittern. Phys Z 13:297–309

- 14.
Ehresmann B, Martin B, Horn AHC, Clark T (2003) Local molecular properties and their use in predicting reactivity. J Mol Model 9:342–347. doi:10.1007/s00894-003-0153-x

- 15.
Gadaczek I, Hintze KJ, Bredow T (2012) Periodic calculations of excited states for solids using a semiempirical approach. Phys Chem Chem Phys 14:741–750. doi:10.1039/c1cp22871d

- 16.
Margraf JT, Strauss V, Guldi DM, Clark T (2015) The electronic structure of amorphous carbon nanodots. J Phys Chem B 119:ASAP. doi:10.1021/jp510620j

- 17.
Stewart JJP (2015) MOPAC online manual. Stewart Computational Chemistry, Colorado Springs. http://openmopac.net/manual/Solids_cluster.html. Accessed 5 April 2015

- 18.
Stewart JJP (2008) Application of the PM6 method to modeling the solid state. J Mol Model 14:499–535. doi:10.1007/s00894-008-0299-7

- 19.
Ehresmann B, de Groot MJ, Alex A, Clark T (2004) New molecular descriptors based on local properties at the molecular surface and a boiling-point model derived from them. J Chem Inf Comput Sci 44:658–668. doi:10.1021/ci034215e

- 20.
Güssregen A, Matter H, Hessler G, Müller M, Schmidt F, Clark T (2012) 3D-QSAR based on quantum-chemical molecular fields: towards an improved description of halogen interactions. J Chem Inf Model 52:2441–2453. doi:10.1021/ci300253z

- 21.
Ehresmann B, de Groot MJ, Clark T (2005) Surface-integral QSPR models: local energy properties. J Chem Inf Model 45:1053–1060. doi:10.1021/ci050025n

- 22.
Clark T (2010) The local electron affinity for non-minimal basis sets. J Mol Model 16:1231–1238. doi:10.1007/s00894-009-0607-x

- 23.
ElKerdawy A, Wick CR, Hennemann M, Clark T (2012) Predicting the sites and energies of noncovalent intermolecular interactions using local properties. J Chem Inf Model 52:1061–1071. doi:10.1021/ci300095x

- 24.
Clark T, Halik M, Hennemann M, Jäger CM (2013) Simulating “soft” electronic devices. In: Hicks MG, Kettner C (eds) Molecular engineering and control. Logos, Berlin, pp 137–150

- 25.
Etschel S, Waterloo A, Margraf JT, Amin AY, Hampel F, Jäger CM, Clark T, Halik M, Tykwinski RR (2013) An unsymmetrical pentacene derivative with ambipolar behavior in organic thin-film transistors. Chem Commun 49:6725–6727. doi:10.1039/C3CC43270J

- 26.
Jäger CM, Schmaltz T, Novak M, Khassanov A, Vorobiev A, Hennemann M, Krause A, Dietrich H, Zahn D, Hirsch A, Halik M, Clark T (2013) Improving the charge transport in self-assembled monolayer field-effect transistors—from theory to devices. J Am Chem Soc 135:4893–4900. doi:10.1021/ja401320n

- 27.
Schubert C, Margraf JT, Clark T, Guldi DM (2014) Molecular wires—impact of π-conjugation and implementation of molecular bottlenecks. Chem Soc Rev 44:988–998. doi:10.1039/c4cs00262h

- 28.
Meyer B, Marx D (2003) Density-functional study of the structure and stability of ZnO surfaces. Phys Rev B 67:035403. doi:10.1103/PhysRevB.67.035403

- 29.
Kovacik R, Meyer B, Marx D (2007) F centers versus dimer vacancies on ZnO surfaces: characterization by STM and STS calculations. Angew Chem Int Ed 46:4894–4897. doi:10.1002/anie.200604399

- 30.
Horn AHC, Lin J-H, Clark T (2005) Multipole electrostatic model for MNDO-like techniques with minimal valence

*spd*-basis sets. Theor Chem Acc 114:159–168; erratum: (2007) 117:461–465 - 31.
Atienza C, Martin N, Wielepolski M, Haworth N, Clark T, Guldi DM (2006) Tuning electron transfer through

*p*-phenyleneethylene molecular wires. Chem Commun 30:3202–3204. doi: 10.1039/B603149H - 32.
Bauer T (2015) Multi-Agenten-Simulation organischer Feldeffekttransistoren. Ph.D. thesis. Universität Erlangen-Nürnberg, Erlangen

- 33.
Ling MM, Reese C, Briseno AL, Bao Z (2007) Non-destructive probing of the anisotropy of field-effect mobility in the rubrene single crystal. Synth Met 157:257–260. doi:10.1016/j.synthmet.2007.02.004

- 34.
Reese C, Bao Z (2007) High-resolution measurement of the anisotropy of charge transport in single crystals. Adv Mater 19:4535–4538. doi:10.1002/adma.200701139

- 35.
Jämbeck JPM, Lyubartsev AP (2012) Derivation and systematic validation of a refined all-atom force field for phosphatidylcholine lipids. J Phys Chem B 116:3164–3179. doi:10.1021/jp212503e

## Acknowledgments

This work was supported by the Deutsche Forschungsgemeinschaft as part of SFB 953 “Synthetic Carbon Allotropes” and EXC 315 “Engineering of Advanced Materials”, and by the Bavarian Government within the “Solar Technologies go Hybrid” (SolTech) initiative. JTM was supported by a Beilstein Fellowship.

## Author information

### Affiliations

### Corresponding author

## Electronic supplementary material

Below is the link to the electronic supplementary material.

### ESM 1

(DOCX 90 kb)

## Rights and permissions

**Open Access ** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## About this article

### Cite this article

Margraf, J.T., Hennemann, M., Meyer, B. *et al.* EMPIRE: a highly parallel semiempirical molecular orbital program: 2: periodic boundary conditions.
*J Mol Model* **21, **144 (2015). https://doi.org/10.1007/s00894-015-2692-3

Received:

Accepted:

Published:

### Keywords

- Point Charge
- Molecular Electrostatic Potential
- Rubrene
- Classical Molecular Dynamic
- Periodic Calculation