A tool for the morphological analysis of mixtures of lipids and water in computer simulations

Fuhrmans, Marc; Marrink, Siewert-Jan

doi:10.1007/s00894-010-0858-6

Original Paper
Open access
Published: 31 October 2010

Volume 17, pages 1755–1766, (2011)

Journal of Molecular Modeling

Marc Fuhrmans¹ &
Siewert-Jan Marrink¹

Abstract

When analyzing computer simulations of mixtures of lipids and water, the questions to be answered are often of a morphological nature. They can deal with global properties, like the kind of phase that is adopted or the presence or absence of certain key features like a pore or stalk, or with local properties, like the local curvature present at a particular part of the lipid/water interface. While in principle all of the information relating to the global and local morphological properties of a system can be obtained from the set of atomic coordinates generated by a computer simulation, the extraction of this information is a tedious task that usually involves using a visualization program and performing the analysis by eye. Here we present a tool that employs the technique of morphological image analysis (MIA) to automatically extract the global morphology—as given by Minkowski functionals—from a set of atomic coordinates, and creates an image of the system onto which the local curvatures are mapped as a color code.

Introduction

Motivation

With the development of new models and the steady increase in available computing power, computer simulations have become more and more valuable in the study of lipid systems. While the exact conformations of individual lipid molecules are of interest for some applications, most of the time the focus is on the behavior of aggregates of lipids as a whole. Recent examples have been reviewed in [1].

In many of these studies, at some point during the analysis of the simulation, a morphological property of the system—i.e., a property that solely depends on the shape of the lipid aggregate—needs to be characterized. For the more general properties, like the phase adopted and the presence or absence of stalks or pores, the task at hand can be accomplished by loading the obtained coordinates into a visualization program and performing the analysis by eye, but analyzing a large number of simulations in this way can be a tedious task. For the determination of more specific, quantitative properties like the interface area, volume and curvatures, such a naive approach is largely impossible.

One possible way to automate morphological analyses of trajectories generated by computer simulations is to use the technique of morphological image analysis [2] to extract morphological information in the form of Minkowski functionals [3]. This approach has been used to study, e.g., a pore distribution [4] and membrane fusion events during a phase transition [5], as well as to monitor the self-assembly of vesicles [6]. Another approach is to describe morphological features as persistent voids based on the theory of alpha shapes [7] and persistent homology [8], which has been applied to characterize vesicle fusion [9]. However, no implementation of either method is currently readily available to the majority of researchers—none are included in any of the widely used molecular dynamics software packages.

Here, we present an extension of the Gromacs software package [10] that enables the morphological image analysis of molecular aggregates. In addition, an option to extract local curvatures has been added to the method which, to the authors’ best knowledge, has not been employed before, at least in the field of lipid aggregates.

Theory

In three dimensions, there are four Minkowski functionals corresponding to the volume whose morphology is to be determined, the area of the interface separating that volume from the rest of the system, and the integrated mean and Gaussian curvatures of that interface.^{Footnote 1} As such, both geometrical (shape) and topological features (connectivity) are characterized.^{Footnote 2}

For black and white digital (i.e., pixelated) images, the process used to extract the Minkowski functionals is well established and can be accomplished by simply counting the pixels and pixel components of lower dimensionality that comprise the image. This means that, for three-dimensional images, one only needs the number of voxels^{Footnote 3} and the number of faces, edges and vertices of which these voxels consist, where voxel components shared by several voxels are counted only once. The Minkowski functionals can then be obtained as sums over these numbers, as given in Table 1. A way of obtaining the morphology of a set of coordinates is therefore to translate the system into a three-dimensional image composed of black and white voxels [2].

Table 1 The relation between volume V, surface area A, mean breadth B, Euler characteristic χ, integrated mean curvature H, integrated Gaussian curvature K, voxel edge length ξ, and the numbers of cubic voxels n_c, faces n_f, edges n_e and vertices n_v that define the positive space
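The body of Table 1 is not reproduced here. As a hedged illustration of the counting procedure, the sketch below uses the relations of the standard formulation of morphological image analysis [2], which are assumed (not confirmed) to be the tabulated ones: V = ξ³ n_c, A = ξ² (−6 n_c + 2 n_f), 2B = ξ (3 n_c − 2 n_f + n_e), χ = −n_c + n_f − n_e + n_v, with H = 2πB and K = 4πχ. The function name, the doubled-coordinate bookkeeping and the `box` argument are illustrative choices and are not taken from g_mia.

```python
from itertools import product
from math import pi

def minkowski_functionals(voxels, xi=1.0, box=None):
    """Minkowski functionals of a set of positive voxels by component counting.

    voxels: iterable of integer cell indices (i, j, k)
    xi:     voxel edge length
    box:    optional grid size (nx, ny, nz); if given, periodic wrapping is applied
    """
    # Doubled coordinates: the voxel center sits at odd coordinates; offsetting
    # 1, 2 or 3 of them by +-1 addresses a face, an edge or a vertex. Storing the
    # components in sets counts shared components only once.
    components = [set(), set(), set(), set()]  # cubes, faces, edges, vertices
    for i, j, k in voxels:
        for off in product((-1, 0, 1), repeat=3):
            p = (2 * i + 1 + off[0], 2 * j + 1 + off[1], 2 * k + 1 + off[2])
            if box is not None:
                p = tuple(c % (2 * n) for c, n in zip(p, box))
            components[sum(1 for o in off if o != 0)].add(p)
    n_c, n_f, n_e, n_v = (len(s) for s in components)

    breadth = xi * (3 * n_c - 2 * n_f + n_e) / 2
    chi = -n_c + n_f - n_e + n_v
    return {
        "V": xi ** 3 * n_c,
        "A": xi ** 2 * (-6 * n_c + 2 * n_f),
        "B": breadth,
        "chi": chi,
        "H": 2 * pi * breadth,   # assumed relation between H and B
        "K": 4 * pi * chi,       # assumed relation between K and chi
    }

# A cube of 2 x 2 x 2 voxels, and the same cube with one corner voxel removed.
cube = set(product(range(2), repeat=3))
dented = cube - {(0, 0, 0)}
full, cut = minkowski_functionals(cube), minkowski_functionals(dented)
print(full["A"], cut["A"], full["B"], cut["B"], full["chi"], cut["chi"])
```

Under these assumed relations, the cube and the dented cube share the same surface area and mean breadth and differ only in volume, which is the degeneracy caused by the restriction to right angles.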

The advantages of this method are the straightforwardness of its implementation and its rigor, in the sense that the resulting numbers are the exact values of the Minkowski functionals for the image. Its only disadvantage is therefore the approximation introduced by the image itself. The use of voxels entails a limitation to right angles, which restricts the values that the surface area and integrated mean curvature can take and causes several structures to share the same value. As an example, removing any single voxel from a cube of eight voxels leaves the surface area and integrated mean curvature unchanged. For curved surfaces, this results in a general tendency to overestimate these functionals.

However, the Euler characteristic—which only requires the connectivity to be identical for the image and the original system—can be determined exactly, and the volume can be obtained with only slight errors that can be minimized by choosing a sufficiently high resolution.

For a broad spectrum of morphological tasks, the values obtained are sufficient, even with the restrictions mentioned above. For most applications concerning molecular aggregates, the Euler characteristic and the integrated mean curvature are arguably the most important values. Purely topological analyses, including both phase determination and the detection of stalks or pores, rely primarily on the Euler characteristic, which is not affected by the limitations of morphological image analysis. In addition, due to the systematic nature of the error in the integrated mean curvature, the value obtained can still be used to extract morphological information. The absence of mean curvature is accurately recognized as zero mean curvature, and systems with positive total mean curvature can be distinguished from those with negative total mean curvature. In addition, both the integrated mean curvature and the surface area can be used to further characterize structures within families with similar topologies, since the lack of absolute values is not detrimental to relative comparisons.

As an extension to this basic application of morphological image analysis, it is also possible to obtain local values of the mean and Gaussian curvature. As has been shown by Hyde et al. [11], every surface vertex can be associated with a certain mean and Gaussian curvature. Again, these values are exact for the image, and summation over all surface vertices while taking into account the different surface areas associated with each vertex leads to global (integrated) values for the mean and Gaussian curvatures which are identical to those obtained with the method described above. Mapping the local curvatures onto the image as a color code allows further characterization of the structure at hand, enabling the easy detection of areas with different curvatures, as well as detailed comparison of similar structures.

The rest of this article is organized as follows. In the sections “Implementation” and “User-definable options and parameters,” details about the implementation and the user-definable parameters are given, while “Simulation setup” describes the parameters used in the simulations that were analyzed in order to test our program. The “Results” section provides the results of these sample applications, in addition to results of tests performed on model systems.

Methods

Implementation

The implementation discussed in this publication was realized using the Gromacs 3.3 software package [10], but should in principle compile with any version of Gromacs from 3.0 to date, with only minor modifications. The executable is called g_mia and was written in the C programming language. The source code is available upon request. Acceptable input file formats are the standard formats supported by Gromacs.

Basic algorithm

We treat the image as a three-dimensional cubic grid representing the simulation box, onto which every coordinate is mapped.^{Footnote 4} To avoid any artificial empty spaces caused by representing atoms (or groups of atoms in the case of coarse-grained models) by their centers of mass only, every coordinate is expanded into a spherical cloud of coordinates, each of which is mapped onto the grid individually.^{Footnote 5} Depending on the type and number of particles mapped to it, cells are declared to be either positive or negative, where positive cells represent the molecular aggregate. The global values of the Minkowski functionals can then be obtained by counting the number of cubes, cube faces, edges and vertices, taking into account the periodic boundaries.
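The mapping step can be sketched as follows. This is a minimal illustration and not the g_mia source: the cloud is assumed to be drawn uniformly at random inside a sphere (the text does not state how the cloud points are placed), only a single positive phase is handled, and the function and argument names are hypothetical, loosely mirroring the options -dim, -sr, -npts and -thresh1.

```python
import random
from collections import Counter

def voxelize(coords, box, dim, sr, npts, thresh, seed=0):
    """Map coordinates onto a periodic cubic grid and return the positive cells.

    coords: list of (x, y, z) positions
    box:    (Lx, Ly, Lz) box lengths, same unit as coords
    dim:    grid edge length
    sr:     radius of the spherical cloud around each coordinate
    npts:   number of cloud points generated per coordinate
    thresh: minimum number of points in a cell for it to count as positive
    """
    rng = random.Random(seed)
    shape = tuple(max(1, round(length / dim)) for length in box)
    counts = Counter()
    for center in coords:
        for _ in range(npts):
            # Rejection sampling: draw from the enclosing cube until the
            # offset falls inside the sphere of radius sr (assumed scheme).
            while True:
                off = [rng.uniform(-sr, sr) for _ in range(3)]
                if sum(o * o for o in off) <= sr * sr:
                    break
            cell = []
            for c, o, length, n in zip(center, off, box, shape):
                frac = ((c + o) % length) / length  # periodic wrap into the box
                cell.append(min(n - 1, int(frac * n)))
            counts[tuple(cell)] += 1
    positive = {cell for cell, hits in counts.items() if hits >= thresh}
    return positive, shape

cells, shape = voxelize([(0.5, 0.5, 0.5)], (2.0, 2.0, 2.0), dim=1.0, sr=0.0, npts=1, thresh=1)
print(shape, cells)
```

Raising `thresh` makes the image more conservative, since a cell then needs more cloud points before it is declared positive.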

For the local values of the mean curvature and Gaussian curvature, every surface vertex^{Footnote 6} is identified as being one of the possible cases listed in Fig. 1, and the corresponding local curvatures given by the product of the interface area and the curvature value associated with that type of surface vertex are stored. However, we wish to map the curvature to voxels, not vertices. To that end, nonsurface voxels (i.e., positive voxels that do not contribute a single face to the interface) are eliminated. The stored curvatures of the surface vertices are then distributed equally among the surface voxels adjacent to that particular vertex, as illustrated in Fig. 2.
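The redistribution from vertices to voxels can be illustrated with a short sketch. The per-vertex curvature values of Fig. 1 are not reproduced here, so the numbers below are placeholders, and the function name and data layout are hypothetical. The point of the example is only that an equal split among the adjacent surface voxels conserves the total curvature.

```python
from collections import defaultdict

def distribute_to_voxels(vertex_curvature, adjacent_surface_voxels):
    """Split each vertex curvature equally among its adjacent surface voxels.

    vertex_curvature:        dict vertex -> stored curvature (area times curvature value)
    adjacent_surface_voxels: dict vertex -> list of surface voxels touching that vertex
    """
    voxel_curvature = defaultdict(float)
    for vertex, value in vertex_curvature.items():
        owners = adjacent_surface_voxels[vertex]
        for voxel in owners:
            voxel_curvature[voxel] += value / len(owners)
    return dict(voxel_curvature)

# Placeholder values: vertex "a" touches two surface voxels, vertex "b" touches one.
per_voxel = distribute_to_voxels(
    {"a": 1.0, "b": -0.5},
    {"a": [(0, 0, 0), (1, 0, 0)], "b": [(1, 0, 0)]},
)
print(per_voxel)
```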

To visualize the local curvatures, a PyMOL [12] file is generated that represents the image as voxels onto which the curvatures are mapped as a color code. Due to the different ranges of curvatures encountered, it is impossible to use a fixed color scale. We therefore employ a two-color scheme in which white corresponds to a curvature of zero while the two colors are used to distinguish negative and positive curvatures, with the intensity of the color indicating the value. Full intensity is assigned to the voxel(s) with the maximum absolute curvature encountered in a given system, and the color range is symmetric in the sense that full intensity indicates the same (absolute) value for both colors. While this means that every image has its own color code, it is the most efficient scheme to highlight differences in local curvature.
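A symmetric two-color scale of this kind can be sketched as below. The choice of red for positive and blue for negative curvature is an assumption made for the example (the text does not name the two colors), and the function is not the code that writes the PyMOL file.

```python
def curvature_colors(curvatures, positive=(1.0, 0.0, 0.0), negative=(0.0, 0.0, 1.0)):
    """Return one RGB triple per curvature value.

    White corresponds to zero curvature. Full intensity is assigned to the
    largest absolute curvature in the list, with the same scale for both signs.
    """
    peak = max((abs(c) for c in curvatures), default=0.0)
    colors = []
    for c in curvatures:
        t = abs(c) / peak if peak > 0 else 0.0  # intensity in [0, 1]
        target = positive if c > 0 else negative
        # Linear blend from white (t = 0) to the target color (t = 1).
        colors.append(tuple((1.0 - t) + t * channel for channel in target))
    return colors

print(curvature_colors([0.0, 2.0, -1.0, -2.0]))
```

Because the scale is normalized per image, the same color in two different images does not generally indicate the same curvature.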

Optional steps

The data generated can often be improved considerably by performing some image manipulation steps and averaging.

Image manipulation

Depending on the particle density in the coordinate file and the desired resolution of the grid, it is possible to include an image manipulation step right after the creation of the image. In this step, isolated clusters of either positive or negative cells below a certain size are interpreted as noise and removed. Performing this step also allows the number of actual isolated clusters above the threshold size to be determined at no additional cost, which is useful morphological information in its own right.
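A possible form of this filtering step is sketched below. It assumes that cells are connected through shared faces (6-connectivity) under periodic boundaries, which the text does not specify, and that clusters of at most `max_noise_size` cells are treated as noise. All names are hypothetical.

```python
from itertools import product

NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))

def clusters(cells, shape):
    """Split a set of grid cells into face-connected clusters (periodic grid)."""
    remaining, found = set(cells), []
    while remaining:
        start = remaining.pop()
        cluster, stack = {start}, [start]
        while stack:
            i, j, k = stack.pop()
            for di, dj, dk in NEIGHBORS:
                nb = ((i + di) % shape[0], (j + dj) % shape[1], (k + dk) % shape[2])
                if nb in remaining:
                    remaining.remove(nb)
                    cluster.add(nb)
                    stack.append(nb)
        found.append(cluster)
    return found

def remove_noise(positive, shape, max_noise_size):
    """Flip small isolated clusters of either sign; also count the positive clusters left."""
    positive = set(positive)
    all_cells = set(product(range(shape[0]), range(shape[1]), range(shape[2])))
    cleaned = set(positive)
    for c in clusters(positive, shape):
        if len(c) <= max_noise_size:
            cleaned -= c   # small positive speck becomes negative
    for c in clusters(all_cells - positive, shape):
        if len(c) <= max_noise_size:
            cleaned |= c   # small negative hole becomes positive
    return cleaned, len(clusters(cleaned, shape))

# A slab with a one-cell hole, plus a one-cell speck floating above it.
slab = {(i, j, k) for i in range(4) for j in range(4) for k in range(3)}
noisy = (slab - {(1, 1, 1)}) | {(2, 2, 4)}
cleaned, n_clusters = remove_noise(noisy, (4, 4, 6), max_noise_size=1)
print(len(cleaned), n_clusters)
```

The cluster count returned alongside the cleaned image corresponds to the additional morphological information mentioned above.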

Spatial averaging

Due to the fixed nature of the grid, even aggregates with perfectly homogeneous curvature, like a sphere, will display different curvatures for different regions, depending on how well the rasterization of the image fits the surface in that region. In general, the curvature tends to be underestimated when the surface is aligned with the grid, and overestimated when it is diagonal to the grid.

Two spatial averaging options can be employed to reduce this effect. First, the local curvature obtained can be averaged over neighboring surface voxels within a certain distance. In addition, it is possible to further improve the results by determining local curvatures for multiple grid orientations. In this case, the resulting curvature values of each positive surface voxel for every orientation are stored together with the coordinate corresponding to the center of that voxel rotated back to the original orientation. The values of all rotations are then mapped back onto the original grid, averaging the values over the entries mapped onto the same cell. If needed, the resulting values can be averaged over neighboring cells. Since it is not possible to preserve the periodic boundary conditions with a rotated grid, the area of interest is centered in the box, and only cells within a certain distance from the center (i.e., cells that lie within both the volume of the box and the rotated grid for all rotations) are taken into account.
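The first of the two options can be illustrated with a minimal sketch. It assumes an unweighted mean over all surface voxels within a Euclidean distance measured in grid cells and ignores periodic boundaries; the weighting actually used by the tool is not stated in the text, and an unweighted mean of this kind does not in general conserve the summed curvature. The function name is hypothetical.

```python
def average_over_neighbors(curvature, radius):
    """Average each surface voxel's curvature over surface voxels within `radius`.

    curvature: dict mapping surface voxel (i, j, k) -> local curvature
    radius:    averaging range in grid cells; 0 leaves the values unchanged
    """
    r2 = radius * radius
    averaged = {}
    for voxel in curvature:
        nearby = [
            value
            for other, value in curvature.items()
            if sum((a - b) ** 2 for a, b in zip(other, voxel)) <= r2
        ]
        averaged[voxel] = sum(nearby) / len(nearby)
    return averaged

raw = {(0, 0, 0): 3.0, (1, 0, 0): 0.0, (5, 0, 0): 6.0}
print(average_over_neighbors(raw, 1))
```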

Time averaging

While not included as such in the current version of the presented tool, it can also be useful to average the curvatures over time (i.e., over several snapshots of a trajectory). For the global values, this can easily be accomplished after analysis by taking the moving average of the calculated curvatures. For the local values, time averaging can be performed at the coordinate level prior to the analysis, effectively yielding time-averaged curvatures.
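For the global values, the post-processing amounts to a windowed average over the per-frame results. A minimal sketch, with a hypothetical function name and an arbitrary window length, is:

```python
def moving_average(values, window):
    """Windowed average of a per-frame series, e.g. the integrated mean curvature."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of values")
    running = sum(values[:window])
    averaged = [running / window]
    for i in range(window, len(values)):
        # Slide the window by one frame: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        averaged.append(running / window)
    return averaged

print(moving_average([1, 2, 3, 4, 5], 3))
```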

User-definable options and parameters

It is not generally possible to use the same set of parameters for the analysis of all possible structures and representations. The implementation therefore allows most parameters to be determined by the user. This section describes the parameters and discusses what to consider to achieve the optimal results. The corresponding command line options are given in parentheses.

Input files

The tool needs a coordinate or trajectory file (-c) and an index file (-n) in which the particles that correspond to the positive phase are listed.

Imaging options

The edge length of the grid (-dim), the radius of the spherical cloud used to expand the coordinates (-sr), and the number of coordinates generated during the expansion (-npts), as well as the minimum number of coordinates mapped onto a grid cell required to count it as positive (-thresh1) need to be specified.

As a general consideration, the resolution needs to be high enough to accurately depict the structure to be analyzed, but is limited by memory requirements, due to the need for several three-dimensional arrays during the computation.^{Footnote 7} In addition, using a high resolution usually requires the expansion of the coordinates in order to avoid the creation of artificial empty voxels due to the limited coordinate density, which partially offsets the desired high resolution. The radius of the spherical cloud should therefore be chosen as the smallest radius sufficient to avoid noise. (An example of the effects of the chosen resolution for a sample application is given in “Applications,” Table 2)

Table 2 Average values of the volume V, the surface area A, the integrated mean curvature H, and the Euler characteristic χ extracted from the simulation of a porated membrane in relation to the resolution (identified by the edge length d of the grid) and the radius of the spherical cloud r_S used for the expansion of the spheres. Note that, in order to ensure that no effects are masked, cluster filtering was not applied to the images

It also turns out that, in order to accurately detect flat morphologies with zero mean curvature, it is necessary to calibrate the parameters used. Since molecular aggregates usually have low short-range order, fluctuations of individual molecules from the mean will show up as either bumps or dents in the created image. Since a given resolution does not necessarily have the same propensity to produce bumps as it does to produce dents, a net curvature will be measured. The threshold parameter can be used to adjust the number of “positive” coordinates that must be mapped onto a single grid cell to count that cell as positive in order to (on average) produce an equal number of bumps and dents, thus ensuring that an artificial mean curvature is not introduced into the measurement.

In addition, it is also possible to use the coordinates of the particles corresponding to the negative phase—mapping them onto the grid as described above, but counting them as negative instead. If that is desired, the number of phases to consider must be set from 1 to 2 (-np), and the index file needs to contain a second group in which these particles are listed.

If isolated clusters below a certain size are to be removed (see above), the maximal cluster size that is considered noise must be specified (-cs).

Averaging options

The range over which the local curvatures are averaged over neighboring voxels needs to be specified (-ar1 and -ar2), with a value of zero indicating no averaging. Two values are needed, one for the averaging of every single grid orientation (-ar1) and one for the averaging performed after the values of all grid orientations have been collected (-ar2).

If multiple grid orientations are to be used, the number of rotations around every axis (-nx, -ny and -nz), the corresponding angle increments (-depsilon, -dphi and -dtheta), and the radius around the center of the box within which voxels are considered (-dr) must be set.^{Footnote 8} In order to achieve the best result, care must be taken to avoid sampling similar orientations.

In addition, it is possible to specify a threshold which ensures that voxels are only counted as positive if a minimum number of local curvatures corresponding to different rotations have been mapped onto that voxel (-thresh2). However, unlike the other averaging steps, this option will discard curvature and does not yield exact results, and should therefore be used with care. For the results presented in this work, a threshold of zero has been used, effectively disabling this option.

For the results discussed in the “Results” section, the grid resolution and the radius used to expand the coordinates will be given, along with the number of rotations and the distance used to average the local values.

Simulation setup

The simulations shown in this article were performed using the coarse-grained MARTINI model [13] with the Gromacs 3.3 software package [10], employing the standard run parameters for the MARTINI model at a timestep of 40 fs. Both pressure and temperature were coupled to a reference value using the Berendsen scheme [14]. Lennard-Jones and Coulomb interactions were obtained at every step for particles occurring within a cut-off of 1.2 nm according to a neighbor list that was updated every 10 steps. The Lennard-Jones and the Coulomb potentials were modified with a shift function to ensure that the interactions vanished smoothly at the cut-off. Electrostatic interactions were screened with an effective dielectric constant of 15 (which is the standard value for the MARTINI model).

Three processes were used as sample applications: spontaneous aggregation of lipids into a lipid bilayer, closure of a pore in a membrane, and stalk formation between apposed lipid bilayers (with setups similar to those used for the simulations described in [15–17], respectively).

Spontaneous aggregation

The system simulated consists of 256 DOPE (dioleoylphosphatidylethanolamine) lipids with 768 water beads (with one bead corresponding to four water molecules), starting from random coordinates. The simulation was carried out at a reference temperature of 315 K, with a coupling time constant of 0.5 ps, anisotropic pressure coupling, compressibilities of 5 × 10⁻⁵ bar⁻¹ for the diagonal elements and 1 × 10⁻⁷ bar⁻¹ for the off-diagonal elements of the pressure tensor, coupling time constants of 1.2 ps, and reference pressures of 1.0 bar.

Porated membrane

The system consists of a bilayer of 128 DPPC (dipalmitoylphosphatidylcholine) lipids with a preformed pore at excess hydration (2653 water beads). After a short equilibration, the simulation was carried out at a reference temperature of 323 K with a coupling time constant of 1.0 ps, semi-isotropic pressure coupling with a compressibility of 1 × 10⁻⁵ bar⁻¹, a coupling time constant of 1.0 ps, a reference pressure of 1.0 bar for the direction perpendicular to the bilayer, and a compressibility of 0 bar⁻¹ for the plane containing the bilayer.

Stalk formation

The initial configuration was two bilayers of 98 DOPE lipids each, separated by two slabs consisting of 65 water beads each, corresponding to an effective hydration level of 2.65 water molecules per lipid. To induce the formation of stalks, the simulation was carried out at a reference temperature of 375 K with a coupling time constant of 0.5 ps, semi-isotropic pressure coupling with a compressibility of 1 × 10⁻⁵ bar⁻¹, a coupling time constant of 1.2 ps, and a reference pressure of 1.0 bar for all directions.
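The stated hydration level can be checked directly from the bead counts, given that one water bead corresponds to four water molecules:

$$ \frac{2 \times 65 \times 4}{2 \times 98} = \frac{520}{196} \approx 2.65 \ \text{water molecules per lipid}. $$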

Results

Model systems

The method was first tested on two artificially constructed model systems with very high coordinate densities: a solid sphere and a toroidal pore. This allowed the potential of the method to be assessed by analyzing virtually noise-free structures, and meant that the exact values for these ideal geometries were available for comparison. Plots of the coordinates of the model systems used are depicted in Figs. 3 and 4.

Spheres

Figure 5 shows the measured and theoretical values of the Minkowski functionals for solid spheres of different radii. The image was constructed with a resolution of 0.4 nm and by expanding the coordinates into spheres of radius 0.2 nm. As predicted for a solid object, the Euler characteristic is obtained with a value of exactly 1. The volume of the image is only slightly higher than that of the original, which is due to the rasterization of the image and the expansion of the coordinates into spheres. However, the surface area and integrated mean curvature are overestimated to a larger extent. In fact, the values obtained lie between the values of the sphere and a cube with an edge length identical to the diameter of the sphere (see discussion in “Theory”). Nevertheless, the values are proportional to the values of the original and could therefore be used in principle to distinguish between spheres with different sizes.
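For reference, the two bounds mentioned above can be written out for a sphere of radius R and a cube of edge length 2R. This assumes the relation H = 2πB between the integrated mean curvature and the mean breadth B, with B = 2R for the sphere and B = 3R for the cube:

$$ A_{\rm sphere} = 4\pi R^2, \quad A_{\rm cube} = 24 R^2, \quad \frac{A_{\rm cube}}{A_{\rm sphere}} = \frac{6}{\pi} \approx 1.91, $$

$$ H_{\rm sphere} = 4\pi R, \quad H_{\rm cube} = 6\pi R, \quad \frac{H_{\rm cube}}{H_{\rm sphere}} = \frac{3}{2}. $$

Under these assumptions, the measured surface area would be expected to exceed the exact value by at most a factor of about 1.91, and the integrated mean curvature by at most a factor of 1.5.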

To calculate the local curvatures, eight rotations around every axis were used, and the values were averaged over neighboring voxels up to a distance of three grid cells. Looking at the mapping onto the image shown in Fig. 6, we can see that both the mean and the Gaussian curvatures are accurately mapped with positive values. While the mean curvature is correctly mapped almost homogeneously over the whole surface, the distribution of the Gaussian curvature for the larger sphere is less even, even with averaging performed. This is a symptom of a general difficulty with mapping the Gaussian curvature that was found in most of our measurements for systems which display large areas of homogeneous Gaussian curvature.^{Footnote 9} However, while this behavior might seem problematic at first, it is partially due to the color scale employed, which assigns full color intensity to the voxel with the highest absolute curvature (see “Implementation”). In the presence of regions with high Gaussian curvature (as in the example of the smaller sphere), these are accurately detected, and artificial fluctuations in regions of lower Gaussian curvature become relatively less important as well as less visible in our depiction.

Toroidal pores

Figure 7 shows the values of the Minkowski functionals for a toroidal^{Footnote 10} pore through an 8.8 × 8.8 nm² layer of 4.0 nm thickness as a function of the pore radius,^{Footnote 11} obtained using a grid size of 0.2 nm and by expanding the coordinates to a radius of 0.1 nm. In addition, the analytical values for the volume V, surface area A, and integrated mean curvature H are plotted:^{Footnote 12}

$$ V = V_{\rm slab} - V_{\rm cyl} + \pi^2 d^2 \left( d + r \right) - \frac{4}{3} \pi d^3, \tag{1} $$

$$ A = 2 \left( A_{\rm rec} - A_{\rm circ} \right) + 2 \pi^2 d \left( d + r \right) - 4 \pi d^2, \tag{2} $$

$$ H = \pi^2 \left( d + r \right) - 4 \pi d. \tag{3} $$

In these expressions, d is half the thickness of the slab, r is the radius of the pore at its smallest extension, A_rec is the area of the bottom or top of the unporated slab, A_circ is the area of the circle with radius d + r, V_slab is the volume of the unporated slab, and V_cyl is the volume of the cylinder with a height of 2d and a radius of d + r.

As before, the Euler characteristic is obtained with the exact value of −1, and the volume of the image is higher but proportional to that of the original. The surface area is overestimated to a larger extent, again showing how the area of a curved surface is increased by the rasterization of the image. The fact that the surface area of the image is actually found to increase over the whole range of radii, in contrast to the values calculated for the original, reflects the increasing percentage of the total surface that is curved for larger pore radii. This causes the slight decrease in the surface area in the original geometry to be overshadowed by the overestimation of areas of curved surfaces in the image.

The integrated mean curvature shows the same general trend for both image and original, but the amount of negative curvature is higher in the image for the measured range of radii. This causes small pores to display negative values for radii of up to 1 nm, while the actual crossover point for the original geometry occurs at approximately 0.5 nm. In addition, it becomes apparent that the values obtained by morphological image analysis are discrete and not continuous,^{Footnote 13} causing small changes in curvature in the original geometry to go unnoticed in the image.
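The crossover quoted for the original geometry can be reproduced from Eq. 3, which vanishes at r = d (4/π − 1). The sketch below evaluates Eqs. 1 and 3 for the geometry used here (d = 2.0 nm, slab area 8.8 × 8.8 nm²). The function names are illustrative, and the slab is assumed to be a simple rectangular box so that V_slab is the slab area times 2d.

```python
from math import pi

def pore_volume(r, d, slab_area):
    """Eq. 1: volume of a slab of thickness 2d with an ideal toroidal pore of radius r."""
    v_slab = slab_area * 2 * d
    v_cyl = pi * (d + r) ** 2 * 2 * d
    return v_slab - v_cyl + pi ** 2 * d ** 2 * (d + r) - 4.0 / 3.0 * pi * d ** 3

def pore_integrated_mean_curvature(r, d):
    """Eq. 3: integrated mean curvature of the ideal toroidal pore."""
    return pi ** 2 * (d + r) - 4 * pi * d

def crossover_radius(d):
    """Pore radius at which Eq. 3 changes sign, obtained by solving H = 0 for r."""
    return d * (4 / pi - 1)

d = 2.0             # half the layer thickness, in nm
area = 8.8 * 8.8    # slab area, in nm^2
r0 = crossover_radius(d)
print(round(r0, 3), pore_integrated_mean_curvature(0.2, d) < 0, pore_volume(1.0, d, area))
```

For d = 2.0 nm the sign change falls at roughly 0.55 nm, in line with the value of approximately 0.5 nm given above.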

The local curvatures were calculated using four orientations for each axis and by averaging over neighboring voxels up to a distance of five grid cells. In the mapping onto the image shown in Fig. 8, the dominance of negative mean curvature for pores of small radii found in the global values is also visible. The mean curvature is accurately found to be minimal in the midsections of the pores, where the highest negative principal curvature is located, and maximal close to the rim, where the lowest negative principal curvature occurs.^{Footnote 14} It is in fact accurately found to be approximately zero in the midsection of the pore of radius 2.0 nm (for this radius and a layer thickness of 4.0 nm, the two principal curvatures cancel in this region). In addition, it becomes more positive overall for higher pore radii, in accordance with the lower negative principal curvature. The Gaussian curvature is also found to be accurately mapped, with the maximum (negative) curvature found in the midsection, and the curvature gradually decreasing to zero the closer one gets to the rim for the two bigger pores. Only for the smallest radius is the minimum Gaussian curvature in the midsection not detected, because the pore size is close to the limit of the resolution used. In principle, this problem could be avoided by using a higher resolution.
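The cancellation at a pore radius of 2.0 nm follows from the geometry of an ideal toroidal pore. In the midsection, one principal curvature is set by the rim of radius d and the other by the circle of radius r around the pore axis. Taking the local mean curvature as the average of the two principal curvatures (a convention assumed here, consistent with Eq. 3), one obtains

$$ k_1 = \frac{1}{d}, \quad k_2 = -\frac{1}{r}, \quad \frac{k_1 + k_2}{2} = \frac{1}{2} \left( \frac{1}{d} - \frac{1}{r} \right), \quad k_1 k_2 = -\frac{1}{d\,r}. $$

For d = r = 2.0 nm the mean curvature in the midsection is zero, while the Gaussian curvature is −0.25 nm⁻². For r < d the midsection mean curvature is negative, and for r > d it is positive.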

It is worth mentioning that the negative spaces of the images of the ideal toroidal pores are images of a stalk. The corresponding stalks will therefore have identical Gaussian curvature and surface area to the pores, but the sign of the mean curvature will be inverted. For the global values, it can therefore be deduced that stalks are accurately characterized as having negative mean curvature if one considers that stalks have a certain minimum radius given by the lipid tail length (approximately 2.0 nm for a typical lipid tail of 16–18 carbon atoms).

Applications

Next we tested our method with trajectories and snapshots taken from actual simulations of lipids. For these, it proved advantageous to define the positive phase as only the atoms or beads corresponding to the lipid tails. This allowed details like pores to be amplified and stalks to be distinguished from configurations in which two membranes are close but there is no contact between the hydrophobic cores.