Adaptive resolution simulation of an atomistic DNA molecule in MARTINI salt solution

We present a dual-resolution model of a deoxyribonucleic acid (DNA) molecule in a bathing solution, where we concurrently couple atomistic bundled water and ions with the coarse-grained MARTINI model of the solvent. We use our fine-grained salt solution model as a solvent in the inner shell surrounding the DNA molecule, whereas the solvent in the outer shell is modeled by the coarse-grained model. The solvent entities can exchange between the two domains and adapt their resolution accordingly. We critically asses the performance of our multiscale model in adaptive resolution simulations of an infinitely long DNA molecule, focusing on the structural characteristics of the solvent around DNA. Our analysis shows that the adaptive resolution scheme does not produce any noticeable artifacts in comparison to a reference system simulated in full detail. The effect of using a bundled-SPC model, required for multiscaling, compared to the standard free SPC model is also evaluated. Our multiscale approach opens the way for large scale applications of DNA and other biomolecules which require a large solvent reservoir to avoid boundary effects.


Introduction
All-atom molecular dynamics (MD) is a powerful simulation tool and can provide detailed insight into the structural and dynamical properties of biomolecular systems [1].Unfortunately, performing these simulations remains challenging even for the state-of-the-art computational platforms due to the inability to cover all spatiotemporal scales of the biomolecular phenomena.Accessing large length and 1596 The European Physical Journal Special Topics time scales thus inevitably requires simplifications of the model, i.e., the model needs to be coarse-grained [2][3][4].
In biomolecular simulations this high computational demand is often associated with realistic modeling of the solvent.Water, the most abundant solvent in nature, is essential for proper functioning and stability of (bio)macromolecules.However, it typically comprises most of the particles in the simulated system and thus the majority of the computational effort is spent on obtaining water-water interactions in distal regions that are not relevant for the problem under consideration.This realization led to the development of several multiscale simulation methods, which reduce the number of degrees of freedom for distal water and at the same time keep the atomistic (AT) resolution where it is necessary [5][6][7][8][9].The speedup of such multiscale simulations with respect to all-atom simulations is proportional to the reduction of the interaction sites in the coarse-grained (CG) model of water.Consequently, from the viewpoint of computational efficiency it is advantageous to use a supramolecular CG model [10][11][12][13], where several water molecules are represented with a single effective bead.A widely used CG model of this class is the MARTINI force field [14,15].
Previously we have already successfully performed multiscale coupling of AT and MARTINI models in the case of an atomistic protein embedded in supramolecular CG water [16] using the adaptive resolution scheme (AdResS) [17][18][19][20][21][22][23].To perform efficient mapping between the AT and supramolecular CG water models, we made use of AT bundled-SPC water models, where the molecules mapped to the same CG bead are restrained by distance-dependent potentials [24,25].The introduction of additional constraints to some extent alters the properties of the model compared to the unrestrained model.The systematic evaluation of the bundled-SPC water by Gopal et al. [26] has shown that the model can be used as a solvent for various biomolecular systems because many thermodynamic properties of the bundled-SPC models are in agreement with the unrestrained SPC and experiments.However, differences between the water models were found for the structure of a coiled-coil dimer and hydration of the active site of a serine protease.
In this work, we focus on double-stranded deoxyribonucleic acid (DNA).A wide range of CG models already exist for DNA [27][28][29][30][31][32][33][34][35][36][37], but here we keep the DNA molecule at AT resolution embedded in a multiresolution supramolecular bundled-SPC/MARTINI univalent salt solution.The simulation setup is similar to the one we used in our previous multiscale simulation of the dielectric properties of the solvated DNA molecule [38].The novelty here is the supramolecular salt solvation using a 4-to-1 molecular mapping for water.Consequently, the DNA molecule is modeled with a different force field, i.e., GROMOS [39], which is used in conjunction with SPC water model, instead of AMBER [40].Furthermore, in the present study our analysis of the system is focused on other physical quantities of interest, not addressed and analyzed in our previous simulation [38].Thus we focus our attention on the structural properties of the solvent around the DNA to determine the effect of bundling.The structural features are characterized by various orientational and geometric order parameters [41][42][43][44][45], density profiles, deformations of the bundle's internal structure, and hydrogen-bonding of water with the DNA.

Multiscale model
We simulate a B-form DNA molecule with a periodic 10 base-pair sequence (5 -CTCTCGCGCG-3 ).To mimic the natural DNA environment we employ a 1 M NaCl salt solution as a solvent and add extra Na + counterions for overall charge neutrality.We model the DNA molecule always at the full AT resolution using the GROMOS 54a7 [39] force field, while the solvent is modeled at multiple resolution levels.Figure 1 shows a pictorial representation of the simulated system with a split view of high, intermediate, and low resolution level domains for better clarity.The level of solvent modeling depends on the distance from the DNA's centerof-mass (CoM).At short distances we resort to the full AT resolution, whereas at larger distances we employ a CG salt solution model.For computational efficiency it is advantageous to choose the geometrical shape of the resolution regions in line with the shape of the simulated macromolecule.In this work, we thus opted for cylindrical boundaries.To ensure that the DNA and the surrounding layer of proximal solvent are always expressed in full AT detail, we set the center of the AT cylinder to match the DNA's CoM at all times.The AT cylinder radius is set to 1.8 nm (about 6 water molecules), whereas the width of the HY region is 1.2 nm (about 4 water molecules).The whole simulation box contains 7212 water molecules (≈ 13% are in the AT region).This setup is motivated by our previous multiscale simulation of the DNA molecule [38], where the selected size of the AT domain was found to suffice.
The solvent in the low resolution distal regime is modeled with the standard MARTINI force field [14], where a single CG bead represents a bundle of four water molecules.According to their current position, the water molecules adapt their resolution from one CG bead to four atomistically resolved molecules and vice versa on-the-fly.To facilitate the supramolecular (in the present case 4-to-1) coupling, where multiple molecules are mapped always to the same CG bead, we need to restrict the relative motion of these molecules; this means that the molecules cannot diffuse far away from each other, thus ensuring a meaningful correspondence between AT and CG coordinates.Accordingly we model the water in the high resolution region with the bundled-SPC [24,25] water model, where an attractive semi-harmonic potential is added between all oxygen atoms within a bundle.In addition, the oxygen-oxygen Lennard-Jones interaction is modified to match the density of the SPC water.We use parameters that correspond to model 1 in Ref. [24], which outperforms the alternative bundled-SPC model 2 [26].For the ions we employ the GROMOS [39] and MARTINI [14] force fields in the AT and CG regions, respectively.Note that the ions in both regions are represented as one site particles, i.e., the level of resolution does not change, but the interactions do (the electrostatic interactions in the CG region are screened, see Sect.3).

Methods and computational details
The multiscale MD simulations are performed with the AdResS method [17][18][19][20][46][47][48].The total force acting on an entity (i.e., a bundle or an ion) α is given by where F at αβ and F cg αβ are the forces between entities α and β, obtained from the AT and CG potentials, respectively.The R α , R β and R are two-dimensional (x, y) vectors of CoMs of entities α and β, and the DNA, respectively.A smooth transition from high and low resolution regimes via hybrid (HY) region is enabled by the sigmoidal function w.It is equal to 1 and 0 in the AT and CG regions, respectively.Due to the chemical potential inequality of the high and low resolution models, the uniform density profile has to be imposed with an external force.To this end, we apply a thermodynamic (TD) force F T D α [19,49] that is defined as a negative gradient of the effective excess chemical potential [50].In practice, however, we use an iterative formula where C is an appropriately chosen numerical prefactor [19,49].The force acts on the CoM in the HY region and depends on the type of the entity.Therefore, we calculate three different forces that correspond to water bundles and Na + and Cl − ions [51].
The simulation setup and protocol are similar as in our previous work [38].For completeness we briefly summarize them here.Simulations are performed with the ESPResSo++ software package [52].We use the standard velocity Verlet algorithm with a time step of 1 fs.A local Langevin thermostat with a friction constant of 5.0 ps −1 is applied to maintain the temperature of 300 K.The hydrogen atoms of the DNA molecule are constrained with the RATTLE [53] algorithm, whereas the geometry of water molecules is constrained with SETTLE [54].The nonbonded cutoff distance is set to r c = 1.2 nm.The electrostatic interactions beyond the cutoff are approximated with the generalized reaction field method [55,56].The dielectric permittivity of the inner region, that is, within cutoff distance is equal to 1 and 15 for the AT and CG regions, respectively.The dielectric permittivity of the outer region is equal to 80 in both regions.The inverse Debye screening length κ = 3.25 nm −1 is set to correspond to a 1 M salt solution.We use an orthorhombic simulation box with dimensions 8.5 nm × 8.5 nm × 3.4 nm, periodic boundary conditions, and minimum image convention.The periodic boundary condition is employed also for the DNA molecule, i.e., each strand is connected to its periodic image along the z-axis by additional intramolecular DNA interactions defined by bond, angle, and dihedral interaction potentials.In addition, the initial coordinates of the DNA molecule are generic (obtained with the 3D-DART webserver [57]) to avoid any strains due to periodicity.With imposed periodicity we are effectively simulating an infinitely long DNA molecule, where the helix is not allowed to perform bending fluctuations [38].Production runs are 10 and 25 ns for the salt and DNA simulations, respectively.In all cases the length of the equilibration is 1 ns.  2. Normalized density profiles (NDP) with standard deviations of bulk solvent system for bundle CoM (red), sodium (green), and chloride (blue).The results are shown for the conventional all-atom simulation and AdResS simulations with the AT region radius size of 2.1 nm.The transitions between resolution regions are marked with the vertical dotted lines.For comparison, the results from an additional AdResS simulation are shown where the TD force is added only to water bundles.The bottom plot shows the TD forces applied to all three molecule types.

Results and discussion
In this section, we first develop a multiscale bundled-SPC/MARTINI NaCl salt solution model by calculating the appropriate TD forces.Then we employ this multiscale salt solution model to simulate a single DNA molecule and perform the statistical analysis of the simulated system.The focal point of our analysis is the local structure of the aqueous solvent surrounding the DNA, which we describe with the local density, orientational and geometrical order parameters, and hydrogen-bonding statistics.For validating and comparing the statistical properties of our multiscale simulation (labeled AdResS), we perform also simulations where the AT region extends across the whole simulation box (labeled all-atom).To determine the effect of the bundling, we compare the multiscale simulation with a fully atomistic simulation, where the original SPC water model (labeled free SPC) is used instead of the bundled-SPC one.

Bundled-SPC/MARTINI NaCl salt solution
In the AdResS scheme, TD forces are applied to compensate the difference in the chemical potential at different levels of resolution and to achieve a uniform density profile throughout the simulation box.We iterate the TD forces in the pure solvent conditions, i.e., in the absence of DNA [38].Converged TD forces for the bundle CoM, Na + , and Cl − are shown in Fig. 2. Employing the TD forces, the observed normalized density profiles (NDPs) of the solvent entities are flat, as shown in Fig. 2 3. Radial distribution functions (RDFs) of sodium-oxygen, chloride-oxygen, sodiumsodium, chloride-chloride, and sodium-chloride for the bulk solvent system.The AdResS RDFs match well the reference all-atom and coarse-grained simulation results.Labels "AdResS AT" and "AdResS CG" denote the AdResS domain, i.e. the atomistic and coarsegrained regions, respectively.plots include also the standard deviations of the average values, which are quite large when those for ions are compared to those for water bundles.To ascertain that this result is due to small number of ions, we plot also the NDPs of an all-atom simulation.The deviations from the ideally flat profile are then in both cases of the same magnitude.
In accordance with previous studies [51,58] we find that smoother NDPs are achieved when the TD forces are slightly extended into the AT and CG regions.Here, we apply the TD force in the range r AT −r skin < r < r HY +r skin , where r skin = 0.3 nm is the extension and r AT = 2.1 nm and r HY = 3.3 nm are the radii of the AT and HY domains, respectively.The TD force for bundles can be iterated independently from the TD forces of ions.This can be observed from the NDPs of AdResS simulation where the TD force is acting only on bundles.In contrast, both Na + and Cl − TD forces have to be iterated simultaneously, since the density distributions of Na + and Cl − are mutually dependent [51].
The positional ion-ion, and water-ion, and water-water correlations are characterized by the corresponding radial distribution functions (RDFs), which are shown in Fig. 3 (ion-ion, water-ion) and in the Supplementary Material (water-water).In all cases, the AdResS simulation is able to reproduce the local structure of the corresponding sub-domain.Some deviations are observed for the ion-ion RDFs due to poorer statistics.  (1 and η (2) (defined by Eqs. ( 2) and (3), respectively) as a function of the radial distance from the DNA's CoM.We compare the AdResS simulation results with the all-atom solvation results.The results are shown for the AT and HY region with vertical dotted line denoting the boundary.The error bars represent the standard deviation of the measurements.

DNA molecule in the multiscale salt solution
The stability of the DNA structure was evaluated by means of the root-mean-square deviation (rmsd) and the root-mean-square fluctuations (rmsf) of the backbone atoms with respect to the average structure.We found that the multiscale approach has a negligible impact on the averaged DNA structure (data shown in the Supplementary Material).
To determine the structural properties of the solvent around the CoM of the DNA molecule we plot in Fig. 4 (left panel) the water oxygen, Na + , and Cl − NDPs.The AdResS simulation reproduces the surrounding solvent's density of the corresponding all-atom simulation within the error bars.The correlation hole, i.e., distance from the DNA's CoM to which the solvent density is perturbed by the DNA, is within the AT region.The NDPs thus demonstrate that the size of the AT region radius of 1.8 nm was set appropriately.
The average orientation of water molecules is examined by considering the lowest two orientational order parameters η (1) and η (2) , defined as where α denotes the angle between the dipole moment of water molecule and the normal vector pointing towards the CG region.The right panel of Fig. 4 shows both order parameters after binning the water molecules according to their distance form the DNA's CoM.A random orientation of water molecules corresponds to η (1,2) = 0.
The European Physical Journal Special Topics  As expected, far away from the DNA, water molecules have no preferred orientation, while in the vicinity the water molecules are highly oriented.The profiles of the AdResS and all-atom simulations match very well in this respect.Slight orientational ordering is observed at the AT/HY interface.This artifact, arising from the resolution change, was observed also in previous studies [25,38,58].
To further characterize the interactions between the DNA molecule and the solvent we investigate the hydrogen bonding between the DNA and water.The average number of the observed hydrogen bonds and the average lifetime are shown in Fig. 5.
Hydrogen bonds can be defined in several ways, e.g., a machine learning definition [59].Here, we employ the standard geometric criterion for the hydrogen bond [60,61], where two atoms are considered to be hydrogen bonded if the donoracceptor distance is < 0.35 nm and the donor-hydrogen-acceptor angle is < 30 • .We regard the nitrogen and oxygen atoms of DNA bases and backbone as the eligible candidates for hydrogen bonding.The average lifetime of a hydrogen bond is computed from the decay of the autocorrelation function.
From the results shown in Fig. 5 we can conclude that both solvations-the bundled-SPC water and the multiscale solvation-give comparable results.A high number of hydrogen bonds is found for the backbone O1P and O2P atoms.However, these are rather short lived.The number of hydrogen bonds in the grooves is quite uniform.The hydrogen bond lifetimes are longer in the minor groove region.

Effect of bundling
The bundled-SPC water model has somewhat different properties compared to the original SPC water model [24,26].To assess the changes induced by the bundling, we replot (Fig. 6) the NDPs of the solvent and water orientation around the DNA  6.Normalized density profiles with standard deviations of water oxygen, sodium, and chloride and water orientational order parameters η (1) and η (2) around the CoM of the DNA molecule.The results are shown for the all-atom bundled-SPC and free SPC solvation.The presentation is the same as in Fig. 4.
molecule for the all-atom bundled-SPC and free SPC solvation.We find significant differences between the two solvation models.In particular, we observe smoother and less structured NDPs in the case of the original (free) SPC water model.The oxygen atoms of the water molecules and sodium ions are also able to penetrate further into the DNA interior.Similarly, the orientational profiles η (1,2) also differ, with less pronounced orientational ordering in the case of the free SPC water model.
To obtain further insight into the different response of bundled versus normal SPC waters, we compute the geometrical order parameter of water, i.e., its tetrahedrality Q 4 , that describes the hydrogen bond network connectivity and is defined by [62] where i is the oxygen atom of the reference water molecule and θ ijk the angle between vectors r ij and r ik , while j and k are the oxygen atoms of nearest neighbors of central water i.The sum runs over distinct pairs of the four closest neighbors, i.e., over six oxygen-oxygen-oxygen angles.
The parameter Q 4 quantifies the local geometric tetrahedral order of the first solvation shell of a given molecule.The value of Q 4 = 1 corresponds to a perfect tetrahedral arrangement, whereas Q 4 = 0 describes an ideal gas.In Fig. 7 we first notice that the SPC and bundled-SPC water models have disparate values of the mean Q 4 far away from the DNA molecule.Specifically, we obtain the values of Q bulk 4 = 0.56 and 0.46 for the SPC and bundled-SPC solvations, respectively.The presence of halfharmonic bonds between oxygen atoms within bundles thus somewhat distorts the hydrogen-bond network of water.Some forethought is needed for the calculation of tetrahedrality at the close proximity to the DNA.In this region the water molecules will very likely have a DNA  atom as one of the four nearest neighbors.Therefore, if four nearest water molecules were considered in Eq. ( 4) the obtained value would be low, as we would include in the calculation also the water molecules in the second solvation shell.Because of this, we calculate the tetrahedrality order parameter by considering four nearest neighbors irrespective of their identity, i.e., whether the neighbors pertain to water molecules or DNA atoms (analogous approach was used in Ref. [44]).The results, shown in Fig. 7, reveal that the thus defined tetrahedral hydrogen bond network structure is almost unperturbed.The DNA is then able to substitute for water in the hydrogen-bond network without breaking the local tetrahedral structure of water.This effect can also be seen from the distributions of the tetrahedrality order parameter (shown in the Supplementary Material).
We observe discrepancies also in the structural properties of the free SPC and bundled-SPC solvations (see RDFs in the Supplementary Material).These are most noticeable for the water-water RDFs, whereas the ion-ion and water-ion RDFs are quite comparable, indicating that the local structure of ions is only slightly disturbed by the bundling.Interactions between water and DNA are also mostly unaltered.In particular, for both solvations we observe very similar hydrogen bonding pattern (shown in the Supplementary Material).

Conclusions
In this article, we derived a multiscale model for 1 M NaCl salt solution where bundled-SPC water model was coupled to the MARTINI CG force field.A multiscale solvent model was utilized to simulate a DNA molecule, considered at the full atomistic scale in the bathing salt solution.With our analysis of the structural and dynamical properties of the DNA and the surrounding solvent we demonstrate that the AdResS simulation is able to reproduce (in the AT region) all properties of the reference all-atom simulation.Additionally, the simulations were compared with the all-atom simulation where the standard (unrestrained) SPC water model is used.As such, our work further extends the current applicability and analysis of the bundled-SPC water model.We find that some properties (such as hydrogen bonding network and water reorientation) match very well, whereas there is some mismatch in others (such as density profiles and orientational order parameters).Nonetheless, the bundled-SPC water model facilitates the 4-to-1 coupling and is thus able to link the all-atom force field with the MARTINI force field.A possible future improvement would be to gradually weaken the constraints acting on the SPC bundle close to the molecule of interest.With the broad parameterization of the MARTINI model and available user-friendly tools, such as MARTINI insane [63] and CHARMM-GUI Martini Maker [64], the possible future applications appear numerous.For example, our dual-resolution model is able to provide additional physical insight into the extent of hydrogen bond network in the solvation shells of solute molecules [16,65].In our case, the atomistic water is coupled to the charge neutral MARTINI water model that is non-polar.By varying the size of the atomistic region one can determine the distance from the DNA molecule at which the inclusion of explicit atomistic water is necessary.Another potential application of our model is to study large scale drug binding to DNA.The CG solvent could contain many (potentially different) drugs, that could compete for binding the DNA (and in doing so they will change their resolution on-the-fly), possibly replacing ions.We would like to emphasize that the coupling of the salt solutions allows similar applications (hydration shell as well as drug binding) in case of other biomolecules.Further broadening the application range is the possibility of dual-resolution representation of not only the solvent but also of the biomolecules themselves, as is done, for example, in the model of Pantani and co-workers [66,67].

Fig. 1 .
Fig. 1.Schematic cross section of simulation box with cylindrical resolution regions: atomistic (AT), hybrid (HY), and coarse-grained (CG).Two levels of resolution are used for solvent molecules.AT level of resolution is used for proximal solvent molecules, within a certain radius from the DNA's CoM.Further away, the distal water molecules are represented as single beads (blue).The Na + and Cl − atoms are shown in orange and yellow, respectively.The high resolution cylindrical region moves with the DNA's CoM, which ensures atomistic modeling of the DNA molecule and the surrounding layer of water at all times.
Fig.3.Radial distribution functions (RDFs) of sodium-oxygen, chloride-oxygen, sodiumsodium, chloride-chloride, and sodium-chloride for the bulk solvent system.The AdResS RDFs match well the reference all-atom and coarse-grained simulation results.Labels "AdResS AT" and "AdResS CG" denote the AdResS domain, i.e. the atomistic and coarsegrained regions, respectively.

Fig. 4 .
Fig.4.Left: water oxygen (top), sodium (middle), and chloride (bottom) NDPs with standard deviations around the CoM of the DNA molecule.Right: water order parameters η(1)  and η(2) (defined by Eqs.(2) and (3), respectively) as a function of the radial distance from the DNA's CoM.We compare the AdResS simulation results with the all-atom solvation results.The results are shown for the AT and HY region with vertical dotted line denoting the boundary.The error bars represent the standard deviation of the measurements.

Fig. 5 .
Fig. 5. Average number (top) and lifetime (bottom) of hydrogen bonds occurring between the DNA atoms and water.The results are shown separately for the selected electronegative nitrogen and oxygen atoms of the DNA.
Fig.6.Normalized density profiles with standard deviations of water oxygen, sodium, and chloride and water orientational order parameters η(1) and η(2) around the CoM of the DNA molecule.The results are shown for the all-atom bundled-SPC and free SPC solvation.The presentation is the same as in Fig.4.

Fig. 7 .
Fig. 7. Tetrahedrality order parameter Q4 as a function of the radial distance from the DNA's CoM and distance to nearest DNA atom (inset).The results are shown for the AT and HY regions with vertical dotted line denoting the boundary.The error bars represent the standard deviation of the measurements.