Predicting Supramolecular Structure from the Statistics of Individual Molecular Events

Pilkiewicz, Kevin R.; Rana, Pratip; Mayo, Michael L.; Ghosh, Preetam

doi:10.1007/s11036-022-02057-8

Predicting Supramolecular Structure from the Statistics of Individual Molecular Events

Open access
Published: 01 December 2022

Volume 28, pages 491–498, (2023)
Cite this article

Download PDF

You have full access to this open access article

Mobile Networks and Applications Aims and scope Submit manuscript

Predicting Supramolecular Structure from the Statistics of Individual Molecular Events

Download PDF

Kevin R. Pilkiewicz ORCID: orcid.org/0000-0002-3278-2268¹,
Pratip Rana²,
Michael L. Mayo¹ &
…
Preetam Ghosh²

1314 Accesses
Explore all metrics

Abstract

As manipulating the self-assembly of supramolecular and nanoscale constructs at the single-molecule level increasingly becomes the norm, new theoretical scaffolds must be erected to replace the thermodynamic and kinetics based models used to describe traditional bulk phase syntheses. Like the statistical mechanics underpinning these latter theories, the framework we propose uses state probabilities as its fundamental objects; but, contrary to the Gibbsian paradigm, our theory directly models the transition probabilities between the initial and final states of a trajectory, foregoing the need to assume ergodicity. We leverage these probabilities in the context of molecular self-assembly to compute the overall likelihood that a specified experimental condition leads to a desired structural outcome. We demonstrate the application of this framework to a toy model in which N identical molecules can assemble into oligomers of different lengths and conclude with a discussion of how the high computational cost of such a fine-grained model can be overcome through approximation when extending it to larger, more complex systems.

Self-Assembly from a Single-Molecule Perspective

Kinetic constraints on self-assembly into closed supramolecular structures

Article Open access 25 September 2017

Mapping the energy landscapes of supramolecular assembly by thermal hysteresis

Article Open access 08 August 2018

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Just as verbal communication is facilitated by the grammatical structure of sentences, which are themselves built up from combinations of words, each of which is a specific sequence of letters; biomolecular communication is principally driven by the folded structures of proteins, which are assembled through electrostatic binding interactions between polypeptide sequences, each generated from the alphabet of amino acids. The careless verbal malapropism that can completely change the meaning of a sentence has a biomolecular analogue as well in the form of misfolding events, which can at best render a protein functionally inert. At worst, the misfolding of proteins can cause great harm, such as in the cases of the neuronal amyloid-beta (Aβ) protein and the pancreatic amylin, whose misfolded structures can seed the formation of plaques that have been implicated as a potential cause for Alzheimer’s disease and type II diabetes, respectively [1].

If even nature has yet to achieve perfectly reliable molecular self-assembly after billions of years, it is hardly surprising that human efforts to synthesize nanoscale structures have likewise struggled with exerting precise structural control. Gold nanoparticle [2] and liposome [3] syntheses have difficulty achieving acceptable levels of monodispersity, biofilm [4] and other monolayer surface depositions [5] are prone to disorder and defects, and supramolecular assemblies [6] are often plagued by competing interactions that lead to disparate products. The problem with these experimental techniques is that, like most naturally ocurring forms of synthesis and self-assembly, they rely upon a presumption of ergodicity. Essentially, ergodicity is the idea that, given enough time, a group of molecules will eventually manage to assemble into the configuration that minimizes their free energy–possibly only after stochastically exploring many less favorable structural outcomes. Many of these alternate structures are only modestly less stable than the optimal structure, however, and the probability of a large enough thermal fluctuation enabling the improperly assembled molecules to break back apart and reassemble “correctly” is too negligible to be observed on experimentally acceptable time scales [7].

One way to tip the odds in the experimenter’s favor would be to stop relying upon diffusion to bring molecules together in the bulk phase and instead both limit the number of reacting molecules present and exercise control over their collisional trajectories in order to promote the formation of desirable products. Newer laboratory techniques such as optical and magnetic tweezers [8], molecular beams [9], and micro- and nanofluidics [10] all embody this approach towards improved synthetic control. While experiments have moved boldly in this new direction, theory has sadly lagged behind, with self-assembly still being modeled in ways that largely rely on statistical thermodynamics and kinetics [11,12,13]. This older modeling paradigm is insufficient to correctly predict the structural outcomes of experiments that involve only a few molecules and that by construction occur far from thermodynamic equilibrium. Theorists are essentially still using a thermal game of Plinko as a model for experiments that are more akin to molecular Tetris.

In this paper we attempt to address this gap by introducing a new modeling framework for the probability of observing a desired molecular product that derives from considerations of the stochastic dynamics of a single self-assembly sequence and its branching structural end states. The result is a probabilistic model that is more appropriate for predicting the outcomes of the newer generation of “single-molecule” experiments and that requires neither an ergodic hypothesis nor a thermodynamic limit. A short version of this work appeared in [14].

2 The model

We imagine a generalized experimental setup in which molecular subunits are injected into a system and propelled by some means towards a targeted site of self-assembly. This site could, for example, be a surface where a monolayer is intended to be grown, or it could be a single molecule or a partially assembled “seed” structure suspended in an optical or magnetic trap. Typically the experimenter can only directly control external parameters such as the temperature, the molecular emission frequency, or the distance to the self-assembly site. We denote this set of tunable parameters as σ, understanding that the specific members of this set will depend upon the precise details of the experiment.

When each molecule is emitted into the system, it will have some initial state characterized by dynamic variables such as its velocity, orientation, quantum vibrational state, and the time of its emission relative to the start of the experiment. For the i^th emitted molecule, we define this initial state as ψ_i. The set of the initial states of all the molecules involved in the assembly procedure of interest will be denoted {ψ}, and the distribution of this set of states will depend upon the tuning parameters σ through a conditional probability distribution function p_S({ψ}|σ), where the subscript S denotes this as the “source” probability.

At some later set of times, the emitted molecules will, one-by-one, reach the self-assembly site; and their dynamical state will have evolved as a consequence of the stochastic transport process used to propel them there. We will denote this new set of dynamical states as {ϕ}, emphasizing that the arrival time of each molecule at the target site, which will also be a random variable, is included as part of {ϕ}. The distribution of this latter set of states will depend upon the initial set of states, and we denote the corresponding conditional probability p_T({ϕ}|{ψ}) as the “transport” probability.

In most self-assembling systems, there are numerous structural products that can be observed at long times depending on the specific details of the molecular collisions, i.e., the set of arrival states {ϕ}. If we denote the set of all these possible structures as S, then the conditional probability that any specific structure s ∈ S is observed will be p_R(s ∈ S|{ϕ}), which we denote the “result” probability. Note that this probability depends on the states of all the molecules that collide with the self-assembly site–even those that do not end up binding to become part of the final structure. For example, while more molecules must necessarily be involved to produce larger, more complex structures, one can imagine that bombarding the self-assembly site with new molecules at too fast a rate could actually serve to disrupt the assembly process, ultimately making smaller final structures more likely than larger ones.

The ultimate probability that we wish to compute is p_F(s|σ), the overall final probability that a set of input parameters will result in molecules assembling into structure s. This probability can be related to the latter three by the following convolution integral:

$$ p_{F}(s\vert\sigma) = \int \!d\{\phi\} d\{\psi\} p_{R}(s\vert\{\phi\})p_{T}(\{\phi\}\vert\{\psi\})p_{S}(\{\psi\}\vert{\sigma}). $$

(1)

The complexity involved in actually evaluating Eq. 1 will naturally depend upon the details of the system under consideration.

In this paper, we restrict our attention to a toy model that demonstrates how this theoretical framework might be applied and what sort of predictions it can be used to make. In this model, we assume that N identical molecules are released at randomly selected times into a one-dimensional drift-diffusion channel characterized by a drift speed v, a diffusion constant D, and a channel length ℓ. The first molecule to traverse the channel binds to a receptor site that catalyzes a self-assembly process with the second molecule to arrive, resulting in a dimer state. We assume that this assembly process takes a finite amount of time, which we denote as the assembly time T_α. If the next molecule arrives while dimer assembly is still occuring, it will be repelled, preserving the dimer state. If, on the other hand, it arrives once the dimer is complete, then a trimer will be formed. This process continues until all N molecules have traversed the channel and have either been repelled or polymerized. Figure 1 depicts a cartoon representation of our toy model and summarizes its possible outcomes in the simplest nontrivial case of N = 3.

Continuing to restrict our attention to the three particle scenario, we can label the three molecules 0, 1, and 2 based on the order in which they are released into the channel, and we may then define the initial state of the i^th particle ψ_i as its injection time τ_i. Its interaction state ϕ_i can be defined analogously as its arrival time t_i at the channel terminus. Because the first such arrival time may be thought of as the start of the experiment, the absolute release and arrival times are irrelevant, and we can replace these six time variables with four time intervals. We define the intervals Δτ₁ and Δτ₂ as the differences between the injection times of particle 1 and particle 0 and particle 2 and particle 0, respectively. Similarly, the intervals Δt₁ and Δt₂ are the equivalent differences in the arrival times of the particles at the self-assembly site. Because the order in which the particles arrive is not fixed, due to the stochasticity of the transport down the channel, these latter time intervals can potentially be negative.

The space of self-assembled structures S in this case contains only the dimer and trimer configurations, which we shall denote as s₂ and s₃, respectively. The probability of observing a dimer at the end of the experiment depends upon whether or not the third molecule to interact does so within a time T_α of the second molecule’s arrival. We can thus write the dimer result probability p_R(s₂|Δt₁,Δt₂) as the following conditional:

$$ p_{R}(s_{2}\vert {{\varDelta}} t_{1},{{\varDelta}} t_{2})= \begin{cases} 1-{{\varTheta}}(\vert{{\varDelta}} t_{2}-{{\varDelta}} t_{1}\vert-T_{\alpha}) & \text{ for } {{\varDelta}} t_{1}\geq 0, {{\varDelta}} t_{2}\geq 0 \\ 1-{{\varTheta}}(\vert{{\varDelta}} t_{2}\vert-T_{\alpha}) & \text{ for } {{\varDelta}} t_{2}\geq {{\varDelta}} t_{1}, {{\varDelta}} t_{1}< 0 \\ 1-{{\varTheta}}(\vert{{\varDelta}} t_{1}\vert-T_{\alpha}) & \text{ for } {{\varDelta}} t_{1}\geq {{\varDelta}} t_{2}, {{\varDelta}} t_{2}< 0. \end{cases} $$

(2)

In the above, Θ(t) is the Heaviside step function. We adopt the convention that it takes value unity when its argument exceeds zero and has value zero otherwise. Since the only other possibility is that a trimer is formed, p_R(s₃|Δt₁,Δt₂) = 1 − p_R(s₂|Δt₁,Δt₂).

Note that, regardless of the order of monomer arrival, if we define the time between the second and third monomer arrival events as Δt₂₁, then Eq. 2 can be rewritten more compactly as p_R(s₂|Δt₂₁) = 1 −Θ(Δt₂₁ − T_α). This simplification is a direct consequence of the three monomers being identical in our model. If they were instead each a different chemical species, the specific order of their arrival would impact the self-assembly process, and Eq. 2 would typically have a different functional form for each ordering.

The first passage time across a drift-diffusion channel is distributed according to the standard inverse Gaussian distribution IG(μ,λ;t) [15], analytically continued to be zero for negative values of its time argument:

$$ IG(\mu,\lambda;t)= \begin{cases} \sqrt{\frac{\lambda}{2\pi t^{3}}} \exp{\left[\frac{-\lambda(t-\mu)^{2}}{2\mu^{2} t}\right]} & t>0 \\ & \\ 0 & t\leq 0 \end{cases} $$

(3)

The parameter μ ≡ ℓ/v is the time it takes to cross the channel in the absence of diffusion, and λ ≡ ℓ²/2D is the average time it would take in the absence of drift. This suggests the following form for the transport distribution p_T(Δt₁,Δt₂|Δτ₁,Δτ₂):

$$ \begin{array}{@{}rcl@{}} &&p_{T}({{\varDelta}} t_{1}, {{\varDelta}} t_{2} \vert{{\varDelta}}\tau_{1}, {{\varDelta}}\tau_{2}) \\ &=&{\int}_{0}^{\infty}dt IG(\mu,\lambda;t) IG(\mu,\lambda;t+{{\varDelta}} t_{1}-{{\varDelta}}\tau_{1}) \\ &&\times IG(\mu,\lambda;t+{{\varDelta}} t_{2}-{{\varDelta}}\tau_{2}), \end{array} $$

(4)

where the time variable of integration t is defined as t₀ − τ₀, the total time it takes the first emitted monomer to traverse the channel.

Finally, we will assume that each molecule has an equal chance of being released into the channel at any moment in time after the previous molecule’s emission, which results in the release time intervals being exponentially distributed (as in a radioactive decay process). Assuming an average injection rate 1/τ, we get

$$ \begin{array}{@{}rcl@{}} p_{S}({{\varDelta}}\tau_{1},{{\varDelta}}\tau_{2}\vert\tau)&=&\frac{1}{\tau^{2}}e^{-({{\varDelta}}\tau_{2}-{{\varDelta}}\tau_{1})/\tau}e^{-{{\varDelta}}\tau_{1}/\tau} \\ &=&\frac{1}{\tau^{2}}e^{-{{\varDelta}}\tau_{2}/\tau}. \end{array} $$

(5)

Note that the dependence of this distribution on Δτ₁ cancels out of the exponent and that τ is the presumptive tuning parameter of the experiment.

Equations 2, 4, and 5 can be substituted into Eq. 1 to calculate the total dimer probability p_F(s₂|τ). The simple conditional form of the result probability p_R(s₂|Δt₁,Δt₂) will lead to a modification in the limits of integration over the arrival time intervals. This leads to a more complicated looking expression for p_F(s₂|τ) that is nonetheless more straightforward to evaluate numerically:

$$ \begin{array}{@{}rcl@{}} p_{F}(s_{2}\vert\tau)&=&\frac{1}{\tau^{2}}{\int}_{0}^{\infty}d{{\varDelta}}\tau_{2}{\int}_{0}^{\infty}d{{\varDelta}}\tau_{1} e^{-{{\varDelta}}\tau_{2}/\tau} \\ &&\times\left[2{\int}_{0}^{\infty}d{{\varDelta}} t_{2}{\int}_{\max({{\varDelta}} t_{2}-T_{\alpha},0)}^{{{\varDelta}} t_{2}}d{{\varDelta}} t_{1} p_{T}({{\varDelta}} t_{1},{{\varDelta}} t_{2}\vert{{\varDelta}}\tau_{1},{{\varDelta}}\tau_{2})\right. \\ &&+\left.2{\int}_{0}^{T_{\alpha}}d{{\varDelta}} t_{2}{\int}_{0}^{\infty}d{{\varDelta}} t_{1} p_{T}(-{{\varDelta}} t_{1},{{\varDelta}} t_{2}\vert{{\varDelta}}\tau_{1},{{\varDelta}}\tau_{2})\right. \\ &&+\left.2{\int}_{0}^{T_{\alpha}}d{{\varDelta}} t_{2}{\int}_{{{\varDelta}} t_{2}}^{\infty}d{{\varDelta}} t_{1} p_{T}(-{{\varDelta}} t_{1},-{{\varDelta}} t_{2}\vert{{\varDelta}}\tau_{1},{{\varDelta}}\tau_{2})\right]. \end{array} $$

(6)

The three integrals over the arrival time intervals in the above expression correspond, respectively, to the cases in which particle 0 arrives first, second, and third. The factors of 2 account for the symmetry, in each case, of swapping the index labels 1 and 2.

3 Results and discussion

Even for such a simple system, the numerical integration required to calculate p_F(s₂|τ) is computationally intensive, with the principal time sink being the repeated evaluations of Eq. 4 for all the different values of the release and arrival time intervals needed to evaluate Eq. 6. We resolved this difficulty by parallelizing the computation, evaluating each instance of p_T(Δt₁,Δt₂|Δτ₁,Δτ₂) on a separate thread of an Nvidia GeForce GTX TITAN GPU with 3,072 cores, 12 GB of RAM, and 1,000 MHz clock speed. This reduced the total computational time by a factor of roughly 1,000. For our integration mesh, we chose a lattice spacing (bin width) of 0.02 time units, and a mesh domain defined in terms of model time units by the inequalities 0 ≤Δτ₁ ≤Δτ₂, 0 ≤Δτ₂ ≤ r, − 10 ≤Δt₁ ≤ r + 10, and − 10 ≤Δt₂ ≤ r + 10, where the integration range r was set equal to 20 time units. For each point on this mesh, the formally infinite upper limit of each of the parallelized time integrals was approximated as 50 time units. These restricted integration ranges were sufficient to approximately normalize all of the probability distributions of the model to within an acceptable tolerance.

After computing the transport probability at each point of the chosen integration mesh, it became tractable to evaluate the integrals over the release and arrival time intervals serially, using an Intel Core i7-2600 CPU with 3.40GHz clock speed and 8 GB of memory. We demonstrate how this computational time varies with integration range r in Fig. 2 for three different bin widths. As the logarithmic scale makes clear, the serial computation time grows roughly exponentially with the integration range. It also scales approximately as an inverse power law of the bin width, with a negative exponent of about 4.

We plot our numerically evaluated probability p_F(s₂|τ) in Fig. 3(A) as a function of the self-assembly time scale T_α for values of the mean release interval τ = 0.5, 1.5, 2.5, and 3.5, in descending order. For all curves, the time scales μ and λ were both fixed at unity. As expected, when self-assembly is instantaneous (T_α = 0), there is no interval of time during which the third molecule can be repelled, so trimer formation is inevitable (p_F(s₂|τ) = 0). At the other extreme, as $T_{\alpha }\rightarrow \infty $, the dimer becomes the only possible product ($p_{F}(s_{2}\vert \tau )\rightarrow 1$). As molecule emissions into the channel become more infrequent (larger τ), the arrival times between molecules will also tend to increase, depressing the dimer probability. These curves are all fit very well by an extended exponential function of the form

$$ p_{F}(s_{2}\vert\tau)\approx 1-\exp\left[-c_{1}(T_{\alpha}/\tau)^{c_{2}}\right], $$

(7)

where c₁ and c₂ are fitting parameters that may depend in a complicated manner upon some dimensionless combination of the time scales τ, μ, and λ. These best fit functions are plotted as solid curves over the numerical data in Fig. 3(A).

Figure 3(B) also plots the final dimer probability versus T_α, but this time τ is held fixed at τ = 1.5 and μ and λ are varied instead. The top curve is the same as the second to top curve in panel (A) (μ = λ = 1). The remaining curves are, in descending order, for (μ,λ) = (2,2), (2,1), and (4,2). These curves illustrate several general trends. First, the dimer probability decreases monotonically with increasing μ, reflecting the fact that a less facilitated channel will tend to space out the arrival times of the molecules, making trimer formation more likely. Increasing λ tends to have the opposite effect, since reducing the diffusivity of the channel narrows the distribution of arrival times (Eq. 3), resulting in a less noisy channel. The variance of the inverse Gaussian distribution is μ³/λ, explaining why p_F(s₂|τ) has a stronger dependence on μ than on λ. These curves are well modeled by the same class of fitting function used in Fig. 3(A).

Perhaps the most informative way of quantifying how self-assembly depends upon our model parameters is with a “phase diagram,” where a relevant parameter subspace is divided into regions based upon the most probable structure in each. For our toy system, this phase diagram is fairly simple and is plotted in Fig. 4(A) as a function of the control parameter τ and the self-assembly time T_α. The transport parameters μ and λ are both fixed at unity. The phase boundary, which turns out to be approximately linear (R²-value of ≈ 0.997), was determined by finding, for each value of τ, the critical value of T_α for which p_F(s₂|τ) = 1/2. For large T_α and small τ, the shorter average interval between particle emissions and the longer assembly time will make it more likely for the third particle to arrive while the first two are still docking, thereby frustrating trimer formation. In the opposite limit, the time between emissions will be long and assembly will occur swiftly–both circumstances that favor the trimer product.

The phase diagram of our toy model naturally becomes more interesting as the number of structural end states increases, but although it is straightforward to generalize Eqs. 2, 4, and 5 to account for N > 3 particles, the computational cost rapidly becomes untenable. Even the next most complicated case of N = 4 taxed our available GPU resources, limiting the accuracy of our numerical integrations as the range of possible departure and arrival time intervals increased with growing τ. To circumvent this, we simulated the drift-diffusion process directly as a one-dimensional Gaussian-distributed random walk and computed the probabilities p_F(s|τ) as the fraction of replicate simulations whose set of first-passage times corresponded in our model to the end state s. Formally, the transport process in our model is a continuous Wiener process, which is only equivalent to a Gaussian random walk in the limit that the discrete time step of the walk goes to zero, but we found that a time step of dt = 0.001 and 10⁵ replicate simulations was sufficient to produce quantitative agreement with the N = 3 results we computed through numerical integration. The computational time was substantially lowered by this approach as well.

The phase diagram computed from our simulations for the N = 4 case is plotted in Fig. 4(B). With four molecules being released into our one-dimensional channel, a tetrameric end state becomes possible in addition to the dimer and trimer structures of the previous case, and we see from the figure that all three final oligomers have a region of parameter space where they are statistically favored. The boundaries separating the regions are both approximately linear. Note that while in the three-particle case the phase diagram regions represent ranges of the parameters τ and T_α for which one structure is the majority product (p_F(s_i|τ) > 0.5), in the case of four or more particles, the regions only denote which product subsumes a plurality of the probability (p_F(s_i|τ) > p_F(s_j|τ), ∀j≠i).

To better portray the prevalence of each structural end state at different values of τ and T_α, we plot a trio of heat maps in Fig. 5 that use color gradients to indicate the probability of each oligomer across the entire phase diagram of Fig. 4(B). While the dimer and tetramer (first and third plots, respectively) are only highly probable near their limited regions of dominance, the trimer (middle plot) retains a high likelihood across almost the entire phase diagram, falling off only in the upper-left corner, where $T_{\alpha }\rightarrow \infty $ and dimer formation becomes certain, and the lower-right corner, where $\tau \rightarrow \infty $ and tetramer formation is guaranteed. This wide-ranging favorability of the trimer structure is a consequence of two principal factors.

The first is the fact that there are two distinct trimer formation sequences, not counting permutations in the molecular labels, compared with only one each for the dimer and tetramer. Specifically, the trimer will result if the difference in the second and third arrival times or the third and fourth arrival times is less than T_α. For the dimer, both of those time differences must be less than T_α, whereas for the tetramer neither must be. This enables the trimer structure to have a substantial likelihood of formation even in the regions where it is not the most favored oligomer.

The second factor contributing to the trimer’s dominance can be understood from considering the distribution of the random variable Δt_i −Δt_i− 1, the difference in the arrival times of two consecutively released molecules. The mean of this random variable is τ, the average difference in their release times, but the variance is equal to 2μ³/λ + (2i − 1)τ². (To understand why this is the case, note that Δt_i −Δt_i− 1 can be rewritten in terms of first passage times and release time intervals as (t_i − τ_i) − (t_i− 1 − τ_i− 1) + Δτ_i −Δτ_i− 1.) When τ is very small, the standard deviation approaches a constant value, making it not improbable for one pair of consecutive arrival times to exceed T_α, so long as T_α ≪ (2μ³/λ)^1/2. Similarly, as τ gets large, the standard deviation tends to grow faster than the mean, making it not unlikely that one pair of consecutive arrival times will be less than T_α. These circumstances allow for a substantial trimer probability to bleed into the regions of dimer and tetramer dominance.

4 Conclusions

While the framework we have devised for quantifying self-assembly in terms of individual molecular interactions is quite general, we have seen even in the simple case of our three-molecule toy model that its computational cost is problematic, especially were one to extend it to the self-assembly of long biopolymers like proteins or microtubules. The number of dynamical variables that characterize the set of interacting states {ϕ} will necessarily grow linearly with the number of molecules considered, but the real problem is that an integral like that in Eq. 4 will have to be evaluated for every permissible set of values these variables can take. The number of integrations will thus grow exponentially with the number of molecules, rendering even parallelization schemes unfeasible for supramolecular assemblies consisting of more than a handful of subunits.

For systems where direct simulation of the dynamics is similarly intractable, the most straightforward way to address this problem is to make physically sensible approximations that constrain the hypervolumetric domain of the variables {ϕ}, thereby reducing the number of integrals that must be computed in parallel. In our N = 3 toy model, for example, we must consider a range of Δt₁ and Δt₂ values broad enough to allow for 3! = 6 different interaction orders. If we work in the large τ limit, however, we can assume that the probability of nonconsecutively released particles interacting in reversed order is negligible. This reduces the number of permissible interaction orderings to three (removing the factor of two from the second term on the right of Eq. 6 and deleting the third term entirely) and eliminates the need to consider negative values of Δt₂. This is only a modest gain, but when we extend our toy model to include tetrameric structures, this approximation scheme reduces the number of allowable orderings from 4! = 24 to a paltry five.

Another way to reduce the computational costs of our modeling framework is to take advantage of the fact that the conditional probabilities at its core are not limited to describing physics at the scale of individual molecular interactions. In experiments performed by Freer et al. [16], for example, nanofluidics and electrophoresis were combined to direct nanowires to self-assemble across designated electrode sites. In that context, it would make more sense to treat individual nanowires as the subunits in our model, even though each wire is composed of a large number of atoms. Similarly, the method developed by Nie et al. [17] to generate Janus microparticles by throttling two imiscible liquids through a narrow microfluidic aperture represents a potential application of our theory where the self-assembly would most appropriately be modeled at the continuum fluid level. In both these cases, the physical forces involved are well understood and straightforward to numerically model, and the number of subunits involved and assembled structures possible are reasonably limited. We hope to leverage systems like these in the future to validate our methods against real experimental results.

References

Knowles TPJ, Vendruscolo M, Dobson CM (2014) The amyloid state and its association with protein misfolding diseases. Nat Rev Mol Cell Biol 15:384–396
Article Google Scholar
Chatterjee S, Bandyopadhyay A, Sarkar K (2011) Effect of iron oxide and gold nanoparticles on bacterial growth leading towards biological application. J Nanobiotechnology 9(34):1–7
Google Scholar
Jing Y, Trefna H, Persson M, Kasemo B, Svedhem S (2014) Formation of supported lipid bilayers on silica: relation to lipid phase transition temperature and liposome size. Soft Matter 10:187–195
Article Google Scholar
Jin X, Riedel-Kruse IH (2018) Biofilm Lithography enables high-resolution cell patterning via optogenetic adhesin expression. PNAS 115(14):3698–3703
Article Google Scholar
Jang J, Hong S, Schatz GC, Ratner MA (2001) Self-assembly of ink molecules in dip-pen nanolithography: A diffusion model. J Chem Phys 115(6):2721–2729
Article Google Scholar
Chakrabarty R, Mukherjee PS, Stang PJ (2011) Supramolecular coordination: Self-assembly of finite two- and three-dimensional ensembles. Chem Rev 111:6810–6918
Article Google Scholar
Hiraoka S (2018) Unresolved issues that remain in molecular self-assembly. Bull Chem Soc Jpn 91:957–978
Article Google Scholar
Biancaniello PL, Kim AJ, Crocker JC (2005) Colloidal interactions and self-assembly using DNA hybridization. Phys Rev Lett 94:058302
Article Google Scholar
Xin SH, Wang PD, Yin A, Kim C, Dobrowolska M, Merz JL, Furdyna JK (1996) Formation of self-assembling CdSe quantum dots on ZnSe by molecular beam epitaxy. Appl Phys Lett 69:3884
Article Google Scholar
Jahn A, Vreeland WN, Gaitan M, Locascio LE (2004) Controlled vesicle self-assembly in microfluidic channels with hydrodynamic focusing. J Am Chem Soc 126:2674–2675
Article Google Scholar
Israelachvili JN, Mitchell DJ, Ninham BW (1976) Theory of self-assembly of hydrocarbon amphiphiles into micelles and bilayers. J Chem Soc Faraday Trans 2(72):1525–1568
Article Google Scholar
Nagarajan R, Ruckenstein E (1991) Theory of surfactant self-assembly: A predictive molecular thermodynamic approach. Langmuir 7:2934–2969
Article Google Scholar
Sweeney B, Zhang T, Schwartz R (2008) Exploring the parameter space of complex self-assembly through virus capsid models. Biophys J 94:772–783
Article Google Scholar
Pilkiewicz KR, Rana P, Mayo ML, Ghosh P (2019) Self-assembly from a single-molecule perspective. In: Proceedings of the 11th EAI international conference on bio-inspired and communications technologies
Folks JL, Chhikara RS (1978) The inverse gaussian distribution and its statistical application–a review. J R Stat Soc Ser B Methodol 40(3):263–289
MathSciNet Google Scholar
Freer EM, Grachev O, Duan X, Martin S, Stumbo DP (2010) High-yield self-limiting single-nanowire assembly with dielectrophoresis. Nat Nanotechnol 5:525–530
Article Google Scholar
Nie Z, Li W, Seo M, Xu S, Kumacheva E (2006) Janus and ternary particles generated by microfluidic synthesis: Design, synthesis, and self-assembly. J Am Chem Soc 128:9408–9412
Article Google Scholar

Download references

Acknowledgements

The authors express gratitude to Dr. Elizabeth Ferguson, Technical Director of the U.S. Army ERDC Installations and Operational Environments Program, for support of this research.

Author information

Authors and Affiliations

U.S. Army Engineer Research and Development Center, Vicksburg, Mississippi, 39180, USA
Kevin R. Pilkiewicz & Michael L. Mayo
Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
Pratip Rana & Preetam Ghosh

Authors

Kevin R. Pilkiewicz
View author publications
You can also search for this author in PubMed Google Scholar
Pratip Rana
View author publications
You can also search for this author in PubMed Google Scholar
Michael L. Mayo
View author publications
You can also search for this author in PubMed Google Scholar
Preetam Ghosh
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Kevin R. Pilkiewicz.

Additional information

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The tests described and the resulting data presented herein, unless otherwise noted, were obtained from research conducted under the Installations and Operational Environments Program of the U.S. Army Corps of Engineers by the U.S. Army Engineer Research and Development Center (ERDC). This work was also partially supported by NSF-CBET 1802588. The views and opinions expressed are those of the authors and do not reflect the official guidance or position of the United States Government, US Department of Defense, or the US Army Corps of Engineers.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Pilkiewicz, K.R., Rana, P., Mayo, M.L. et al. Predicting Supramolecular Structure from the Statistics of Individual Molecular Events. Mobile Netw Appl 28, 491–498 (2023). https://doi.org/10.1007/s11036-022-02057-8

Download citation

Accepted: 25 October 2022
Published: 01 December 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s11036-022-02057-8

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Predicting Supramolecular Structure from the Statistics of Individual Molecular Events

Abstract

Similar content being viewed by others

Self-Assembly from a Single-Molecule Perspective

Kinetic constraints on self-assembly into closed supramolecular structures

Mapping the energy landscapes of supramolecular assembly by thermal hysteresis

1 Introduction

2 The model

3 Results and discussion

4 Conclusions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Open Access

Publisher’s Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Predicting Supramolecular Structure from the Statistics of Individual Molecular Events

Abstract

Similar content being viewed by others

Self-Assembly from a Single-Molecule Perspective

Kinetic constraints on self-assembly into closed supramolecular structures

Mapping the energy landscapes of supramolecular assembly by thermal hysteresis

1 Introduction

2 The model

3 Results and discussion

4 Conclusions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Open Access

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation