1 Introduction

Free energy is a thermodynamic potential derived from the first and second laws of thermodynamics. The first law, which affirms the conservation of energy, is expressed as \({\text{d}}U=\delta Q-\delta W\), where \({\text{d}}U\) is the infinitesimal change in the internal energy of the system, \(\delta Q\) is the infinitesimal heat transfer to the system, and \(\delta W\) is the infinitesimal work done by the system on its surroundings. Rudolf Clausius (1822–1888) realized that although the amount of heat \(\delta Q\) exchanged between two systems lacks the properties of an exact differential, the ratio \(\delta Q/T\), where \(T\) is the absolute temperature, is in fact an exact differential. This insight supported the definition of entropy \(S\) as a state function (Clausius 1865), contributed to refining the second law of thermodynamics, and made the differential change in internal energy expressible as \({\text{d}}U=T{\text{d}}S-p{\text{d}}V\), where \(p\) and \({\text{d}}V\) represent the pressure and the infinitesimal volume change, respectively. In the late nineteenth century, scientists used insights from the first and second laws of thermodynamics, along with Legendre transformations, to identify several thermodynamic potentials. Among these was free energy, which was formulated to facilitate the determination of thermodynamic equilibrium and to allow the calculation of equilibrium constants in chemical reactions.Footnote 1 The Helmholtz free energy (Helmholtz 1882a, b, 1883) and the Gibbs free energy (Gibbs 1876, 1878) are particularly well known and widely used examples.Footnote 2 The former, \(F=U-TS\), is significant for its role in evaluating the maximum work (other than expansion work) achievable in constant-temperature, constant-volume scenarios. On the other hand, the Gibbs free energy, \(G=U+pV-TS\) (also known as free enthalpy in some contexts), predicts equilibrium and describes reaction spontaneity and phase transitions under constant temperature and pressure conditions. Together, these potentials illustrate the versatility of thermodynamics in the strategic control of chemical reactions, phase transitions, and energy transformations across a broad spectrum of applications.

The contributions of Ludwig Boltzmann (1844–1906) and J. Willard Gibbs (1839–1903) introduced a complementary perspective through statistical mechanics.Footnote 3 Gibbs’ introduction of the concepts of the canonical ensemble and the partition function (Gibbs 1902), along with Boltzmann’s foundational work on the statistical interpretation of entropy (e.g., Boltzmann 1872, 1877), provided a comprehensive theoretical framework for statistical mechanics.Footnote 4 Indeed, thermodynamics and statistical mechanics together offer a robust framework for the analysis of condensed matter systems. Thermodynamics addresses the macroscopic behavior of systems through laws that describe energy exchanges and transformations without detailing the microscopic constituents. In contrast, statistical mechanics reveals the microscopic underpinnings of these macroscopic phenomena by using probabilistic models to understand the collective behavior of particles and their emergent properties. This approach complements thermal measurements and enhances predictive capabilities based on microscopic states. The integration of these two disciplines bridges the macroscopic and microscopic realms and provides a comprehensive understanding of equilibrium and non-equilibrium states, phase transitions, and the fundamental interactions of matter and energy.

Fundamental to the symbiotic relationship between thermodynamics and statistical mechanics is the principle of free-energy minimization, which, coupled with entropy maximization, emphasizes the predisposition of systems toward thermodynamic equilibrium and stability. Such principles bridge the macroscopic laws of thermodynamics with the predictions afforded by statistical mechanics. Crucially, Nernst’s principle, often referred to as the third law of thermodynamics, sharpens this picture by establishing that the entropy of a perfectly ordered crystal approaches zero as the temperature approaches absolute zero. This law provides a definitive empirical reference point for free-energy calculations across various conditions.Footnote 5 The integration of the internal energy function, \(U\), serves as the keystone for the empirical derivation of free energy, a process that is tailored to the type of free energy under consideration. The Helmholtz free energy describes a system held at constant temperature and volume; within these constraints the system tends to maximize its entropy and thereby minimize its free energy, and \(F\) is derived from the internal energy with an adjustment for the temperature-weighted entropy. Conversely, \(G\) is obtained from the enthalpy (\(H=U+pV\)) with the same adjustment for the temperature-weighted entropy and describes systems held at constant temperature and pressure. Integrating the corresponding differential relations with respect to the natural variables—temperature and volume for \(F\), and temperature and pressure for \(G\)—yields the respective free energies. Following this integration, the resulting free energy, whether \(F\) or \(G\), determines the system’s capacity to perform work under specified conditions and serves as a determinant of its thermodynamic stability and phase behavior.
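For reference, the differential relations behind these integrations follow directly from \({\text{d}}U=T{\text{d}}S-p{\text{d}}V\) and the definitions \(F=U-TS\) and \(G=U+pV-TS\):

$${\text{d}}F=-S{\text{d}}T-p{\text{d}}V,\qquad {\text{d}}G=-S{\text{d}}T+V{\text{d}}p,$$

so that \(F\) is naturally a function of temperature and volume, and \(G\) of temperature and pressure, consistent with the integration variables named above.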

Building on these fundamental principles, the challenge intensifies when confronted with non-ideal systems, where simplifying assumptions such as negligible intermolecular forces no longer hold. Unlike ideal gases, complex systems exhibit interactions and behaviors that defy simple mathematical modeling. To illustrate this, consider the differential of the Helmholtz free energy:

$${\text{d}}\left(\beta F\right)=-\beta p{\text{d}}V+{\left(\frac{\partial \left(\beta F\right)}{\partial \beta }\right)}_{V}{\text{d}}\beta ,$$
(1)

which highlights the dependency of free-energy changes on both volume and \(\beta \). Here, \(\beta =1/({k}_{{\text{B}}}T)\), where \({k}_{{\text{B}}}\) is the Boltzmann constant, introduces a reciprocal temperature scale, effectively inverting the conventional temperature dependency.Footnote 6 Since the Gibbs–Helmholtz relation identifies the coefficient \({\left(\partial \left(\beta F\right)/\partial \beta \right)}_{V}\) with the internal energy \(U\), the change in \(\beta F\) along an equilibrium path from an initial state \(({\beta }_{1},{V}_{1})\) to a final state \(({\beta }_{2},{V}_{2})\) is given by:

$$\Delta \left(\beta F\right)=\int_{\left({\beta }_{1},{V}_{1}\right)}^{\left({\beta }_{2},{V}_{2}\right)}\left[-\beta p\left(\beta ,V\right){\text{d}}V+U\left(\beta ,V\right){\text{d}}\beta \right],$$
(2)

which emphasizes the dual influence of pressure and internal energy across these states. Equation (2) illustrates the intrinsic complexities encountered in the theoretical determination of thermodynamic properties for non-ideal systems. The term \(-\beta p(\beta ,V){\text{d}}V\) encompasses the work done by the system during a volume change, which in non-ideal systems is influenced by complex intermolecular forces. These forces vary non-linearly with distance and can be repulsive or attractive depending on the specific conditions (e.g., temperature, pressure, composition). The need to integrate these effects over a trajectory connecting two equilibrium states requires a nuanced understanding of such forces and their effects on system behavior that goes far beyond the simplicity of ideal gas laws, where such forces are ignored. In addition, the integration of \(U(\beta ,V){\text{d}}\beta \) requires an accurate characterization of the internal energy of the system as a function of volume and temperature (inversely related to \(\beta \)). In non-ideal systems, the internal energy includes not just the kinetic energy of the particles, but also the potential energy resulting from intermolecular interactions; these interactions are often temperature dependent and can lead to phenomena such as phase transitions, which pose significant hurdles to theoretical modeling. This whole process has historically been fraught with difficulties (see, e.g., Kirkwood 1935) and becomes exponentially more complex as the number of particles and of their interactions increases.

In such cases, experimental methods have gained the upper hand. This preference stems from the fact that \(p\) and \(U\) can be measured directly and with high precision under controlled laboratory conditions. The process naturally requires control over the temperature and volume of the system to ensure that the conditions necessary for the accurate calculation of the Helmholtz free energy are maintained. This level of control, typically achieved through the use of thermostats for temperature control and rigid containers for volume maintenance, simplifies the experimental procedure by providing a stable environment in which the behavior of the system can be observed and measured without the confounding factors present in uncontrolled settings. Once accurate measurements of \(p\) and \(U\) are secured, the experimentalist proceeds by thermodynamic integration, changing either the temperature or the volume of the system in small, controlled increments while continuously monitoring the changes in pressure and internal energy at each step. The trajectory of these changes forms the integral path connecting the initial and final states of the system in \((\beta ,V)\) space. Through this mapping, it is possible to numerically integrate the empirical data collected along this path and apply Eq. (2) to derive the change in free energy between the two equilibrium states.
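As a purely illustrative sketch of this bookkeeping, and not a reconstruction of any historical procedure, the snippet below evaluates Eq. (2) with the trapezoidal rule along a two-leg path: an isothermal volume change followed by an isochoric change of \(\beta \). The functions `pressure` and `energy` are placeholders for tabulated laboratory or simulation data; an ideal-gas equation of state is supplied here only so that the result can be checked against the analytic answer.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for tabulated data."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def delta_beta_F(beta1, V1, beta2, V2, pressure, energy, n=400):
    """Evaluate Eq. (2) along a two-leg equilibrium path:
    (beta1, V1) -> (beta1, V2) at constant beta, then
    (beta1, V2) -> (beta2, V2) at constant volume.
    `pressure(beta, V)` and `energy(beta, V)` stand in for tabulated
    measurements or simulation output."""
    V = np.linspace(V1, V2, n)                       # isothermal leg
    leg1 = trapezoid(-beta1 * pressure(beta1, V), V)
    beta = np.linspace(beta1, beta2, n)              # isochoric leg
    leg2 = trapezoid(energy(beta, V2), beta)
    return leg1 + leg2

# Consistency check with an ideal gas, p = N/(beta*V) and U = 1.5*N/beta, for which
# Delta(beta*F) = -N*ln(V2/V1) + 1.5*N*ln(beta2/beta1).
N = 100
p_ideal = lambda beta, V: N / (beta * V)
U_ideal = lambda beta, V: 1.5 * N / beta
print(delta_beta_F(1.0, 10.0, 2.0, 20.0, p_ideal, U_ideal))
print(-N * np.log(2.0) + 1.5 * N * np.log(2.0))
```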

The introduction of molecular simulations in the 1950s marked a shift in the study of complex systems (Battimelli et al. 2020). These computational approaches, which include Monte Carlo (MC) methods and molecular dynamics (MD), transformed the ability to analyze and predict molecular behavior from a theoretical point of view, providing insight into the structural, thermodynamic, and kinetic properties of materials as well as biological molecules. The MC method, based on Metropolis et al. (1953), uses a stochastic approach to explore the configuration space of a system by probabilistically generating and accepting or rejecting configurations based on their statistical weight, with the original Metropolis algorithm employing the Boltzmann distribution to directly sample system configurations. This approach allows equilibrium averages of thermodynamic quantities to be estimated directly from the sampled configurations, without requiring an explicit evaluation of the partition function. On the other hand, MD provides a dynamic perspective by tracing the evolution of systems through time (Alder and Wainwright 1957, 1958). By integrating the Newtonian equations of motion for particles, using potential energy functions like the Lennard–Jones (LJ) and Coulombic potentials to model interparticle forces, MD simulations yield time-dependent trajectories that elucidate the behavior of the system at the microscopic level. From these trajectories, the energy function and its differential coefficients, such as heat capacity and compressibility, can be derived, giving a detailed view of the system’s response to changes in external conditions. Collectively, MC and MD simulations facilitate the computation of energy functions and their derivatives and illuminate the behavior of the system from both a thermodynamic and a statistical mechanical perspective, especially when experimental data are lacking or incomplete. This symbiosis of MC and MD methods with traditional approaches enhances the capacity to unravel the complex interplay of forces within materials and biological entities, effectively bridging theoretical models with empirical realities.Footnote 7
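The acceptance rule at the core of the Metropolis scheme can be illustrated with a minimal sketch (again, not the original code): a randomly chosen particle of a small Lennard-Jones system is displaced, and the move is accepted with the Boltzmann probability \(\min \left(1,{e}^{-\beta \Delta U}\right)\). The box size, particle number, and displacement step used below are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def lj_energy(r2, eps=1.0, sigma=1.0):
    """Lennard-Jones pair energies from squared separations."""
    sr6 = (sigma**2 / r2) ** 3
    return 4.0 * eps * (sr6**2 - sr6)

def particle_energy(i, pos, box):
    """Interaction energy of particle i with all others (minimum-image convention)."""
    d = pos - pos[i]
    d -= box * np.round(d / box)                 # nearest periodic image
    r2 = np.delete(np.sum(d * d, axis=1), i)     # drop the self-term
    return np.sum(lj_energy(r2))

def metropolis_step(pos, box, beta, max_disp=0.1):
    """One trial move: displace a random particle and accept it with
    probability min(1, exp(-beta * dU)), the Metropolis criterion."""
    i = rng.integers(len(pos))
    old_energy = particle_energy(i, pos, box)
    old_position = pos[i].copy()
    pos[i] = (pos[i] + rng.uniform(-max_disp, max_disp, 3)) % box
    d_energy = particle_energy(i, pos, box) - old_energy
    if d_energy > 0.0 and rng.random() >= np.exp(-beta * d_energy):
        pos[i] = old_position                    # reject: restore the old configuration
        return False
    return True                                  # accept

# Tiny demonstration: 27 particles started on a cubic lattice in a periodic box.
box, n_side = 3.0, 3
grid = np.linspace(0.5, box - 0.5, n_side)
pos = np.array([[x, y, z] for x in grid for y in grid for z in grid])
accepted = sum(metropolis_step(pos, box, beta=1.0) for _ in range(2000))
print("acceptance ratio:", accepted / 2000)
```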

In the realm of molecular simulations, free energy is particularly important in two different types of problems (Frenkel and Smit 2023, p. 266). The first class concerns the assessment of phase stability in macroscopic systems. Traditional simulation techniques are often limited in this respect, especially for first-order phase transitions; in these scenarios, phase transitions can occur spontaneously at such a low rate that accumulating sufficient statistical data to determine the probabilities of the system being in different phases becomes a daunting challenge. As a result, the use of free-energy calculations becomes indispensable and allows researchers to evaluate and compare the stability of different phases when direct observation and conventional sampling methods prove inadequate.Footnote 8 The second area is the study of energy barriers that exist between two (meta)stable states within a system; by analyzing the height and shape of these barriers, scientists gain insight into how often and under what conditions a system might transition from one state to another. This is particularly relevant in cases where the rate of these transitions is so low that direct observation is not possible.

Free-energy calculations through molecular simulations began to gain prominence within the domain of condensed matter in the late 1960s, sparking a period characterized by fervent scientific inquiry and computational ingenuity (see also Ciccotti et al. 1987, pp. 80–84). This embryonic stage in simulation techniques was driven by both ambition and the inherent constraints of the period’s technological capabilities. Mirroring the chronological unfolding of these computational efforts, this article describes the historical and methodological innovations from the early efforts to the turning point represented by the introduction of umbrella sampling (Torrie and Valleau 1977). This demarcation is not simply a temporal checkpoint but a methodological one, which represented a shift in the capability to explore free-energy landscapes; the advent of umbrella sampling signaled a new epoch in simulation methodology, one that offered a robust response to the sampling challenges that had long plagued the field. This methodological zenith stands as a natural culmination point for this retrospective, encapsulating a defining era of discovery and innovation that has indelibly shaped the contours of molecular simulations in condensed matter physics. The conclusion of the essay at this point is intentional and emphasizes the significance of umbrella sampling as a cornerstone technique that continues to inform contemporary scientific endeavors.

The historiography of molecular simulation techniques, as explored in fields such as theoretical physics, physical chemistry, and beyond, remains an area characterized by a dearth of comprehensive studies. The existing literature provides a fragmented tableau of the evolution of the field. In this landscape, the work of Battimelli et al. (2020) represents a major effort, providing a comprehensive and insightful historical analysis that extends beyond the confines of free-energy calculations to the broader field of molecular simulation. Thus, while invaluable for its sweeping perspective, it leaves open the possibility for more focused investigations of free-energy calculations. Other contributions are found interspersed within the broader scientific corpus, with Ciccotti et al. (1987), Chipot et al. (2007), and Frenkel and Smit (2023) emerging as influential examples.Footnote 9 While their works are primarily scientific in nature, they provide relevant insights into the historical context; however, they do not engage in a comprehensive historical discourse. The field is, thus, at a critical crossroads, where the need for a detailed and systematic exploration of its trajectory is historiographically evident. This article is a step in the effort to analyze the development of molecular simulations in condensed matter from a historical perspective; it attempts to examine the fundamental theories, computational methods, and interplay of scientific concepts that have collectively guided the field’s development. The choice of umbrella sampling as a terminus is also practical, given the extensive material scope. A future article will cover developments from the late 1970s to the 97th course of the International School of Physics “Enrico Fermi” in Varenna, Italy, from July 23 to August 2, 1985, entitled “Molecular-Dynamics Simulation of Statistical-Mechanical Systems” (Ciccotti and Hoover 1986). This academic gathering stood as a manifesto for molecular simulations and signaled the formal integration of this domain into condensed matter research (Battimelli et al. 2020, pp. 180–182). Recognizing the expansive scope of this scholarly pursuit, this paper seeks to contribute a distinct piece to the historical narrative of the discipline.

2 The era of computational realities

In the mid-twentieth century, physics and chemistry were catalyzed by the emerging field of molecular simulations; this computational renaissance initiated a methodological shift that reshaped the analysis of material properties (Battimelli et al. 2020). The first efforts in this area centered on studying the equations of state of diverse substances, with the goal of encapsulating the complex behaviors of matter within precise, predictive mathematical frameworks. Initial expositions of many-body systems provided the bedrock for such explorations (Ciccotti et al. 1987, pp. 4–9). The advent of the Metropolis algorithm (Metropolis et al. 1953) was a transformative event that propelled the MC method beyond its initial confines and into the broader realm of statistical mechanics and thermodynamics.Footnote 10 Over time, the application of MC methods was extended (e.g., Rosenbluth and Rosenbluth 1954), eventually leading to remarkable developments in the techniques for constructing and analyzing phase diagrams. For example, Wood and Parker (1957) used MC methods to calculate compressibility factors and isochore properties, comparing them with experimental data for argon. Their work, which included mapping the coexistence lines between gas–liquid and solid–fluid phases, contributed to the collective effort within the scientific community to expand the understanding of phase transitions (see, e.g., Wood and Jacobson 1957).

In this era of scientific ferment, a symposium on the statistical mechanics theory of transport properties was organized by Ilya Prigogine (1917–2003) in Brussels in 1956 (Prigogine 1958). It was at this meeting that Berni Alder (1925–2020) and Thomas Wainwright (1927–2007) presented their pioneering study that marked the beginning of MD (Alder and Wainwright 1957, 1958). Their inquiry into the phase transitions of hard spheres, designed to mirror the interactions of atoms in a condensed phase through non-penetrable, contact-only interactions, stripped away the layers of atomic complexity. Bolstered by the coding expertise of Mary Ann Mansigh (born 1932), their approach provided a vivid window into the atomic interactions within condensed matter, effectively narrowing the gap between theoretical constructs and tangible observations. In a departure from the MC approach, MD represented a shift toward a deterministic simulation of particle motion.Footnote 11 Indeed, while the Metropolis algorithm relies on probabilistic approaches to sample state space and is instrumental in equilibrium statistical mechanics, MD simulates the Newtonian dynamics of particles. This method also allows for the direct observation of time-dependent phenomena and offers a dynamic picture of particle interactions and the evolution of systems over time, whereas the Metropolis algorithm is primarily adept at static characterizations, providing snapshots of systems at equilibrium. The innovation brought by Alder and Wainwright lay in the temporal resolution of MD simulations, which made it possible to study kinetic processes, transport properties, and non-equilibrium states, enriching the field of computational physics with a new lens through which to view the microcosm of molecular interactions.

In this context, Wood and Jacobson (1957) set up a comparative analysis between the MC and MD methods; by recalculating the equation of state for hard spheres using MC simulations and comparing their results with MD results from Alder and Wainwright (1957), their study engaged in a form of dialectical analysis. The goal was to reconcile the discrepancies between these two computational methods to contribute to a better understanding of phase transitions in hard-sphere systems, and to the reliability of these computational techniques in simulating physical systems.Footnote 12 Their study revealed a remarkable phenomenon: within a certain density range, the system exhibited two distinct yet intersecting branches in the equation of state. This duality permitted the system to oscillate between two states at different pressure levels. A first-order phase transition was “strongly suggested,” as evidenced by the abrupt escalation of pressure values observed as the simulation progressed (Wood and Jacobson 1957, p. 1207). This finding provided compelling support for the occurrence of solid-phase transitions in hard spheres, especially at higher densities. Their outcomes, particularly with respect to phase behavior under varying densities and the implications for solid-phase transitions, highlighted the power of the MC method in demystifying complex phenomena.

The consolidation of MD as a distinct scientific discipline was driven in particular by the contributions of Aneesur Rahman (1927–1987). In his computational investigation of condensed matter dynamics, Rahman charted the microstates of liquid argon, facilitating a comprehensive examination of interatomic forces modulated by thermal fluctuations (Rahman 1964). This research provided a validation of MD simulations and demonstrated their ability to reproduce and predict the behavior and thermodynamic properties of liquid systems with remarkable fidelity; his work helped cement the reputation of MD simulations as a viable tool for exploring and understanding the complex dynamics characteristic of real-world fluid systems. Such developments broadened the methodological repertoire for scientists across disciplines and enabled a computational analysis of the subtle energetics within different phases of matter. Rahman’s efforts were paralleled by Loup Verlet’s (1931–2019) numerical strategy for the temporal evolution of particle systems (Verlet 1967). Focusing on an algorithm for numerical integration of Newton’s equations of motion, Verlet’s method computed particle positions at a new time step from the positions at the two previous time steps together with the forces evaluated at the current one. This approach omitted direct velocity calculation, increasing computational efficiency and numerical stability for extended simulations. The simplicity of the algorithm and the minimization of computational resources provided a robust computational framework for representing molecular trajectories, using a two-body LJ potential to simulate systems such as argon with remarkable fidelity to experimental data.
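A schematic version of this position-only recursion might read as follows; the one-dimensional harmonic force stands in for the Lennard-Jones forces of the original study (it is chosen here only because its exact trajectory is known), and all parameter values are arbitrary.

```python
import numpy as np

def verlet_trajectory(r_current, r_previous, force, mass, dt, n_steps):
    """Propagate positions with the Verlet recursion,
    r(t + dt) = 2 r(t) - r(t - dt) + (dt**2 / m) F(r(t)),
    in which velocities never appear explicitly."""
    trajectory = [r_previous, r_current]
    for _ in range(n_steps):
        r_new = (2.0 * trajectory[-1] - trajectory[-2]
                 + (dt**2 / mass) * force(trajectory[-1]))
        trajectory.append(r_new)
    return np.array(trajectory)

# Demonstration on a one-dimensional harmonic oscillator (F = -k x), whose exact
# trajectory is x(t) = cos(t) for k = m = 1 and x(0) = 1, v(0) = 0.
k, m, dt = 1.0, 1.0, 0.01
x0 = np.array([1.0])                      # position at t = 0
x_prev = np.array([np.cos(-dt)])          # position at t = -dt seeds the recursion
traj = verlet_trajectory(x0, x_prev, lambda x: -k * x, m, dt, n_steps=1000)
print("Verlet:", traj[-1][0], " exact:", np.cos(1000 * dt))
```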

As the tapestry of twentieth-century scientific inquiry unfolded, the evolution of computational methods in physics and chemistry transcended a simple linear progression to reveal the dynamic interplay of theoretical insights, experimental challenges, and computational innovations. The pioneering forays into molecular simulation represent a rich fabric of cross-disciplinary fertilization, technological breakthroughs, and methodological evolution.

For the first time, thanks to the computer, it became possible to study the evolution in time of systems consisting of thousands—or even millions—of elements. This allowed scientists to simulate the behavior of real macroscopic objects, and predict their properties. And so a new science was born, molecular simulation, of which molecular dynamics is the most accomplished achievement, whose protagonists try to leave behind, once again, a “closed world” and open the door to an “infinite Universe”, where we have the potential to fully reconstruct the mechanisms explaining real macroscopic systems. Indeed, with molecular dynamics, one can “calculate theory”, and hence simulate and predict the observable behavior of real systems (including chemical and biological systems), using as input only the laws of physics and its fundamental constants (Battimelli et al. 2020, p. 3).

The journey toward greater realism in molecular simulations represented a profound shift and bridged the gap between abstract theoretical principles and their tangible computational representations. Within this narrative, the contributions of scholars such as Rahman are epicenters within a vast network of scientific dialogue and exploration, heralding a shift toward “computational realities”—a term that encapsulates the journey toward unprecedented realism in the simulation of molecular behavior. This transition opened a new epoch in the study of matter at the atomic and molecular levels, representing a leap in the ability to simulate, understand, and predict the behavior of matter with remarkable precision (see, e.g., Kapral and Ciccotti 2005; Battimelli et al. 2020).

In the midst of this evolving landscape, the results presented in Barker and Watts (1969) represent another example of the confluence of theoretical rigor and computational innovation in the quest for realism in molecular simulations. The duo undertook a computational study to unravel the properties of liquid water at the molecular level; using the MC technique, their research aimed to calculate the internal energy, specific heat, and radial distribution function of liquid water at 25 °C using an intermolecular pair potential inspired by John Shipley Rowlinson’s (1926–2018) analysis of ice and water vapor (Rowlinson 1951). Their model bypassed non-physical configurations to ensure a more rigorous representation of molecular interactions.Footnote 13 This was achieved by incorporating computational techniques to model the complex forces between water molecules, preventing any overlap of opposite charges and reflecting a true-to-life representation of molecular behavior in liquid water. Remarkably, their calculated values for internal energy and specific heat were “strikingly good” and in close agreement with experimental data, a feat made all the more substantial by the lack of adjustable parameters in their model (Barker and Watts 1969, p. 145). Yet, “[t]he agreement with experiment for the radial distribution function”, they emphasized, “is not outstanding but we believe that these results are sufficiently good to establish the feasibility of this approach to water” (Barker and Watts 1969, p. 145). Despite this caveat, the work added to the arsenal of computational techniques and marked a stride in the ability of molecular simulation methods to address the challenges of understanding the molecular dynamics, phase behavior, and anomalous properties of water that had puzzled scientists for decades.Footnote 14

3 Early simulations of phase transitions in particle systems

In analyzing the history of molecular simulations, particularly as a prelude to the emergence of free-energy calculations, a reassessment of the contributions of Alder and Wainwright (1962) is both prudent and relevant. While their research did not focus on free-energy calculations, it signaled a transformative shift within the field of computational statistical mechanics and laid the foundation upon which subsequent developments in the discipline would be built. Their research provided what is now considered the first computational demonstration of phase behavior in a system, capturing the transition between distinct phases.Footnote 15 This ability to delineate and simulate phase transitions is critical to free-energy calculations, which are fundamentally concerned with the energy changes associated with phase changes. A retrospective analysis, beginning with this article, is necessary to contextualize the developmental trajectory that culminated in the advent of free-energy calculations through molecular simulation (Ciccotti et al. 1987, p. 81).

In their study, Alder and Wainwright introduced a simulation technique to locate the coexistence point of a first-order transition.Footnote 16 Their investigation focused on a two-dimensional hard-disk system; this study, conducted using the LARC computer located at the Lawrence Radiation Laboratory, broke new ground by analyzing a system of 870 particles and processing 200,000 collisions per hour, a crucial step in locating the system’s melting point.Footnote 17 During their investigation, they observed a distinct loop in the system’s isotherm, similar to the classical van der Waals loop, a key phenomenon in identifying and understanding the dynamics of phase transitions (Fig. 1).Footnote 18

Fig. 1

The equation of state for a system of hard disks undergoing a phase transition, showing the system density versus normalized pressure. The van der Waals loop indicates the coexistence of solid and liquid phases under the same pressure conditions. The vertical lines represent the range of pressure fluctuations observed during the simulations. See Alder and Wainwright (1962), p. 360

To infer phase coexistence, they applied the “equal area” rule, a method that involves segmenting this loop into equal areas to determine phase densities.Footnote 19 While conceptually simple, this approach required long simulation times and large sample sizes, reflecting the computational limitations of the time. Their method, while groundbreaking, faced considerable challenges, particularly in the area of thermodynamic integration; the primary problem was the non-reversible nature of the path in certain regions of the phase diagram, which led to inaccuracies in the determination of key thermodynamic quantities. In addition, the extension to three dimensions (simulating systems with up to 500 hard spheres) revealed pronounced hysteresis effects; in this situation, an all-or-nothing phase behavior was observed (Wood et al. 1958; Alder and Wainwright 1960), in which the entire system would remain in either a fluid or a crystalline state over extensive sequences of particle collisions. This phase behavior highlighted the inability of the system to sustain mixed-phase conditions under the simulation parameters employed.

[I]n the largest three-dimensional system investigated with the improved program (500 hard spheres), the particles were either all in the fluid phase or all in the crystalline phase. The system would typically remain in one phase for many collisions. The occasional shift from one phase to the other would be accompanied by a change of pressure. The equation of state was represented by two disconnected branches overlapping in the density range of the transition, since with the limited number of phase interchanges, it was not possible to average the two branches (Alder and Wainwright 1962, p. 359).

The representation of the system’s equation of state by two distinct, non-continuous branches within the density spectrum of the phase transition emphasized the discrete nature of these phase states; the overlap of these branches within the transition density region, without a feasible method to merge them due to the sparse occurrence of phase transitions, underlined a critical limitation in capturing the continuum of thermodynamic states that would characterize a more comprehensive phase diagram. These challenges required subsequent researchers to develop novel strategies for accurately determining melting points in complex systems.

In examining the limitations inherent in the study of Alder and Wainwright (1962), one must consider the broader context of computational physics and molecular simulation as it existed in the early 1960s. Their research, while pioneering in its approach and execution, encountered several hurdles, both technological and theoretical, that narrowed its scope and applicability. The computing power available at the time, exemplified by the LARC computer, imposed constraints on the size and complexity of the simulations that could be performed. Processing 200,000 collisions per hour, while impressive, was a limiting factor in the study of larger or more complex systems; this constraint was particularly evident in the quest for accurate and reliable data on phase transition dynamics, where the size and duration of simulations are critical. In addition, the methodological approach adopted by Alder and Wainwright, primarily the use of the “equal area” rule to infer phase coexistence, was strategic but not without its drawbacks; the simplicity of this method belied the challenges in its practical application, especially with respect to the non-reversible nature of the path in certain regions of the phase diagram. This problem represented a major obstacle to the proper determination of key thermodynamic quantities and, as seen, was exacerbated in three-dimensional systems where hysteresis effects became pronounced. Thus, the transition from two-dimensional to three-dimensional simulations represented a remarkable escalation in complexity that Alder and Wainwright’s methodology was not fully equipped to handle. In addition, while the observational strategy employed in their study offered valuable insights into phase-transition dynamics through the analysis of isotherms, it was somewhat limited in its ability to provide a comprehensive understanding of overall phase behavior; phase transitions in more complex systems often exhibit a number of subtleties (e.g., critical point phenomena, multicomponent systems, and solid–solid transitions) that the techniques employed by Alder and Wainwright were not designed to capture in their entirety.
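The “equal area” rule itself is easy to sketch numerically. The loop below is generated from the reduced van der Waals equation of state rather than from the hard-disk data, so the numbers are purely illustrative; the procedure, bisecting on a trial tie-line pressure until the areas enclosed above and below the line cancel, conveys the idea behind the construction applied to the simulated isotherm.

```python
import numpy as np

def p_vdw(v, T):
    """Reduced van der Waals isotherm, standing in for the simulated hard-disk loop."""
    return 8.0 * T / (3.0 * v - 1.0) - 3.0 / v**2

def equal_area_pressure(T, v_min=0.45, v_max=8.0, n=100001):
    """Bisect on a trial tie-line pressure until the loop areas above and
    below the line cancel (the 'equal area' construction)."""
    v = np.linspace(v_min, v_max, n)
    p = p_vdw(v, T)
    dv = v[1] - v[0]

    def net_area(p_star):
        # outermost crossings of the isotherm with the horizontal line p = p_star
        crossings = np.where(np.diff(np.sign(p - p_star)) != 0)[0]
        v1, v3 = v[crossings[0]], v[crossings[-1]]
        between = (v >= v1) & (v <= v3)
        return np.sum(p[between] - p_star) * dv    # ~ integral of (p - p_star) dv

    # bracket the coexistence pressure between the loop's local extrema
    dp = np.gradient(p, v)
    extrema = p[np.where(np.diff(np.sign(dp)) != 0)[0]]
    lo, hi = extrema.min(), extrema.max()
    for _ in range(60):                            # plain bisection
        mid = 0.5 * (lo + hi)
        if net_area(mid) > 0.0:
            lo = mid                               # tie line still too low
        else:
            hi = mid
    return 0.5 * (lo + hi)

# At a reduced temperature of 0.9 this prints a coexistence pressure of about 0.65.
print(equal_area_pressure(0.9))
```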

The research presented in Hoover and Ree (1967) contributed novel methodologies that complemented and extended the foundational work of Alder and Wainwright (1962). In their quest for more detailed studies of phase transitions, William Hoover (born 1936) and Francis Ree (1936–2020) introduced approaches that allowed them to deal with the challenges associated with non-reversible phase trajectories. They emphasized the impact of surface effects in small systems on the reversibility of phase transitions, an aspect that needed more careful attention (Hoover and Ree 1967, pp. 4873–4874; Mayer and Wood 1965). Indeed, surface phenomena in confined systems can lead to non-reversible phase paths, making it difficult to properly simulate melting processes and phase stability.

Modern computers can accurately simulate the behavior of idealized systems of several hundred particles but they have trouble in studying the melting process in which small-system surface effects make the transition irreversible. It is here suggested that a thermodynamically reversible path linking the solid and fluid phases can be obtained using a periodic “external field” to stabilize the solid phase at low density. The properties of the artificially stabilized solid at low density are studied theoretically and two practical schemes are outlined for determining the melting parameters using computer-calculated entropies (Hoover and Ree 1967, p. 4873).

The introduction of such a periodic external field provided a refined set of tools for studying phase transitions; this approach altered the potential energy landscape of the system and enabled the stabilization of the solid phase under conditions where it would not naturally persist.

The first scheme involved preventing the melting transition by applying an external field that stabilized the solid phase at all densities, creating an “artificial solid.” The second approach allowed the melting transition to occur gradually and reversibly by first using the infinite-strength external field to expand the solid to a low density and then gradually reducing the field strength to allow the system to “melt” in a controlled manner. Both schemes allowed the precise calculation of the solid-phase entropy (Hoover and Ree 1967, p. 4874).

Hoover and Ree’s artificial solid was realized by confining each particle of the system within its own cell, essentially mimicking the effect of an infinitely strong external field (Hoover and Ree 1967, p. 4874). In this artificially confined environment, particles could collide with both the walls of their cells and neighboring particles, which effectively extended the solid phase throughout the density range. At higher densities, the particles were predominantly confined by their neighbors, maintaining the properties of a perfect solid. Conversely, at lower densities, collisions with cell walls became significant, thus preventing the artificial solid from melting. In other words, the artificial solid remained ordered within the lattice of individual cells, even under conditions where a natural solid phase would not exist. Crucially, in the lower density regimes where traditional computational models failed to maintain the solid phase, this methodological approach proved to be an efficient response to the irreversible nature of phase transitions observed in smaller systems. Hoover and Ree’s ingenuity in this regard was to circumvent this limitation and create a controlled environment conducive to the reversible study of phase transitions, while at the same time allowing a more detailed study of melting processes and thermodynamic transitions. By stabilizing the solid phase and facilitating a reversible melting process in their simulations, the two were able to calculate the entropy changes associated with phase transitions while establishing a framework that contributed to the accuracy of phase-behavior analysis.
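A minimal sketch can make the single-occupancy constraint concrete (this is an assumption-laden illustration, not Hoover and Ree’s program): a trial Monte Carlo displacement is rejected outright whenever it would carry a particle outside the cubic cell assigned to it, the cell wall playing the role of the infinitely strong external field. The cell geometry, sizes, and step lengths below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def in_own_cell(r, cell_index, cell_size):
    """True if position r lies inside the cubic cell assigned to this particle."""
    return bool(np.all(np.floor(r / cell_size).astype(int) == cell_index))

def constrained_trial_move(r, cell_index, cell_size, max_disp):
    """Propose a displacement and reject outright any move that would carry the
    particle out of its own cell: the cell wall acts as a hard boundary."""
    r_trial = r + rng.uniform(-max_disp, max_disp, size=r.shape)
    if not in_own_cell(r_trial, cell_index, cell_size):
        return r, False            # bounced off the cell wall
    return r_trial, True           # still inside: pass on to the usual
                                   # hard-sphere / energy acceptance test

# Example: a particle assigned to cell (1, 0, 2) of a unit-cell lattice.
cell_size = 1.0
cell_index = np.array([1, 0, 2])
r = (cell_index + 0.5) * cell_size           # start at the cell centre
for _ in range(5):
    r, accepted = constrained_trial_move(r, cell_index, cell_size, max_disp=0.4)
    print(r, accepted)
```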

Figure 2 provides a graphical representation of the excess entropy as a function of density at fixed energy, effectively delineating the transitions between the artificial solid, solid, and glassy states of the system. This figure is key to understanding the effectiveness of the applied external field in modulating the phase behavior of the system, and in particular illustrates the implementation and impact of the artificial solid framework on the reversibility of phase transitions.Footnote 20 Figure 2 demonstrates the ability of the method to maintain solid-phase stability over a varied density spectrum and highlights the transition mechanisms facilitated by this novel approach. While Hoover and Ree’s primary focus was on establishing a reversible pathway for phase transitions through solid-phase stabilization, the implications of their work for subsequent free-energy calculations and phase-behavior analyses are profound. Their work strategically refined the understanding of phase stability within a computational framework that allowed precise delineation of the melting transition, promoting a thermodynamically reversible pathway that was critical for quantifying the entropy changes associated with phase transitions. Their approach provided optimal results under conditions approximating near-harmonic potentials, however, and was presumably hampered in accurately characterizing systems with hard-core repulsions or solid phases that exhibit significant anharmonic behavior within their mechanical stability limits (Ciccotti et al. 1987, p. 82).

Fig. 2

Excess entropy as a function of density at fixed energy, illustrating the transitions between artificial solid, solid, and glassy states in Hoover and Ree’s study. The dashed line represents the artificially stabilized solid phase under the influence of an external field, demonstrating the methodological innovation of stabilizing the solid phase at different densities and allowing the calculation of entropy changes during phase transitions. See Hoover and Ree (1967), p. 4874

Hansen and Verlet (1969) extended these concepts and applied them to a wider range of force laws and systems. After a fruitful visit to the United States, Verlet began to train a group of young researchers at the Orsay campus of the University of Paris (Battimelli et al. 2020, pp. 67–76). In 1969, he and his student Jean-Pierre Hansen (born 1942) transformed the study of phase transitions in systems interacting via the LJ potential by introducing a computational approach, using the UNIVAC 1108 computer, to analyze a system of 864 particles.Footnote 21 Their research, which differed substantially from Hoover and Ree’s 1967 concept of an artificial solid, focused on the structural and dynamic properties of the liquid and solid phases and the transitions between them. Hansen and Verlet’s approach allowed atoms to move freely within a simulated box using periodic boundary conditions.Footnote 22 Using this method, particles mimicked the conditions of an infinite system by crossing the edges of the simulation box and re-emerging on the opposite side as they exited; this feature was essential in preserving the natural dynamics of the system while controlling fluctuations and allowing reliable calculation of the thermodynamic properties of a single-phase system without the interference of phase separation.Footnote 23 Influencing the comprehension of phase transitions within LJ systems, their article “became the seed of countless other research papers and studies of the phase equilibria in fluids by computer simulation techniques” (Rotenberg et al. 2015, p. 2364).
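In computational terms, the periodic boundary conditions described here reduce to two small bookkeeping operations, sketched below for an arbitrary rectangular box: folding positions back into the primary cell, and measuring interparticle separations to the nearest periodic image.

```python
import numpy as np

def wrap(positions, box):
    """Fold positions back into the primary box: a particle leaving through one
    face re-enters through the opposite one."""
    return positions % box

def minimum_image(displacement, box):
    """Shortest displacement between two particles among all periodic images."""
    return displacement - box * np.round(displacement / box)

box = np.array([10.0, 10.0, 10.0])            # arbitrary box dimensions
r = np.array([[10.3, -0.2, 5.0]])             # a particle that has drifted outside
print(wrap(r, box))                           # -> approximately [[0.3, 9.8, 5.0]]
print(minimum_image(np.array([9.0, 0.0, 0.0]), box))   # -> [-1.  0.  0.]
```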

4 Configuration sampling and the rise of free-energy calculations

In molecular simulations of classical many-body systems, the primary inputs for calculating mechanical properties—such as potential energy, kinetic energy, and the stress tensor—are derived from the positions (\({{\varvec{r}}}^{N}\)) and momenta (\({{\varvec{p}}}^{N}\)) of the constituent particles. The accurate modeling of interparticle forces through potential energy functions is essential to faithfully capturing the mechanical properties of a system and ensuring a detailed representation of its mechanical behavior. However, achieving high-fidelity results, especially near phase transitions, presents significant hurdles. This complexity stems from the fact that phase transitions involve abrupt changes in system properties, necessitating precise thermodynamic and statistical mechanical modeling. For this reason, a comprehensive understanding of phase transitions goes beyond mechanical phenomena and requires an in-depth investigation of thermal properties, such as the Helmholtz and Gibbs free energies and the specific heats (\({C}_{v}\), \({C}_{p}\)). Yet, in contrast to mechanical properties, thermal properties are more challenging to derive from simulation data and demand extensive computational efforts to explore the system’s phase space. This exploration entails mapping a vast multidimensional domain that encompasses all conceivable configurations and states of motion of the system.

Indeed, the concept of phase space is of central importance, as it provides a comprehensive framework in which each point represents a unique microstate of the system, characterized by the positions and momenta of all particles. These microstates cumulatively manifest as macroscopic observables through ensemble averaging, a process that correlates the probabilistic distribution of microstates with observable properties at the macroscopic level. In this context, free energy delineates energetically favorable regions within phase space, which is crucial for understanding the thermodynamic stability and phase behavior of systems because it highlights the energetically accessible configurations that contribute to the observable behavior of the system (see also Frenkel and Smit 2023, pp. 274–275). The primary challenge in the study of phase transitions lies in the need for exhaustive sampling of these microstates to accurately characterize the transitions; given the vast dimensionality of phase space, this task requires the use of advanced computational algorithms coupled with a deep understanding of statistical ensembles. Moreover, bridging the gap between mechanical stability and thermal dynamics in the analysis of phase transitions demands substantial computational resources. These requirements underline the interdisciplinary challenge of accurately simulating phase transitions, where mechanical and thermal properties intertwine to define the behavior of the system under different thermodynamic conditions.

In the mid-1960s, Konrad Singer (1917–2013) and his postdoctoral fellow Ian McDonald (1938–2020) at Royal Holloway College, University of London, made major inroads into improving the sampling scheme of Metropolis et al. (1953).Footnote 24 Their efforts were summarized in a letter to Nature in which Singer described two methods for analyzing the thermodynamic properties of simple fluids (Singer 1966). The first method, attributed to Singer, used a histogram-based MC approach designed for a 32-particle system of gaseous argon. This approach involved constructing a histogram to capture the system’s potential energy distribution, which facilitated the estimation of the classical configurational partition function, \({Z}_{c}(V,T,N)\), across a varied temperature range. Singer’s formula for numerical analysis was given as:

$${Z}_{c}\left(V,T,N\right)=\frac{{V}^{N}}{N!}\int_{{\phi }_{{\text{min}}}}^{{\phi }_{{\text{max}}}}{e}^{-\frac{\phi }{{k}_{{\text{B}}}T}}f\left(\phi \right){\text{d}}\phi .$$
(3)

Here, \(N\) denotes the number of particles, while \(f(\phi ){\text{d}}\phi \) is the frequency of configurations whose potential energy lies within an interval \({\text{d}}\phi \) around \(\phi \), reflecting the potential energy landscape of the system. The parameters \({\phi }_{{\text{min}}}\) and \({\phi }_{{\text{max}}}\) delineate the range of potential energy under investigation. The execution of this method involved a traversal of the configuration space to generate a histogram that effectively mapped the potential energy landscape, allowing an approximation of \({Z}_{c}(V,T,N)\) as:

$${Z}_{c}\left(V,T,N\right)=C\sum_{n}{e}^{-\frac{{\phi }_{n}}{{k}_{{\text{B}}}T}}{F}_{n}.$$
(4)

In Eq. (4), \({\phi }_{n}\) identifies the midpoint potential energy value within each interval, and \({F}_{n}\) quantifies the frequency of configurations occurring within those intervals. The normalization coefficient \(C\) fixed the overall scale of the histogram-based estimate, so that the resulting partition function remained consistent with experimental data.
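The bookkeeping behind Eqs. (3) and (4) can be illustrated with a deliberately simple toy model rather than Singer’s 32-particle argon system: a single coordinate with a quadratic potential (so that \(N=1\) and the prefactor \({V}^{N}/N!\) reduces to the box length), whose configuration space is traversed by uniform random sampling; the analytic Gaussian integral provides a check on the histogram estimate. All numerical choices below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
kB = 1.0                                   # reduced units

# Toy "configuration space": one coordinate x in [-L/2, L/2] with phi(x) = x**2,
# i.e. N = 1 and V = L, so the prefactor V**N / N! in Eq. (3) is simply L.
L, n_samples, n_bins = 10.0, 200_000, 400
x = rng.uniform(-L / 2, L / 2, n_samples)  # uniform traversal of configuration space
phi = x**2

# Histogram of potential energies: F_n counts configurations per bin and phi_n
# are the bin midpoints, the quantities appearing in Eq. (4).
F_n, edges = np.histogram(phi, bins=n_bins)
phi_n = 0.5 * (edges[:-1] + edges[1:])
C = L / n_samples                          # configuration-space volume per sample

def Z_c(T):
    """Histogram estimate of the configurational partition function, Eq. (4)."""
    return C * np.sum(np.exp(-phi_n / (kB * T)) * F_n)

for T in (0.5, 1.0, 2.0):
    exact = np.sqrt(np.pi * kB * T)        # analytic Gaussian integral (L large)
    print(f"T = {T}: histogram estimate {Z_c(T):.4f}, exact {exact:.4f}")
```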

The second method, based on McDonald’s insights, was applied to a 108-particle system of liquid argon and introduced a re-weighting technique for exploring thermodynamic properties over a temperature range from a single set of simulation data. This method provided a computational framework that dynamically used simulation data to estimate thermodynamic properties at different temperatures (e.g., internal energy, specific heat, isothermal compressibility), extending the utility of MC simulations beyond the static models traditionally employed (Singer 1966, p. 1449). At the core of the re-weighting technique was the adaptation of configuration weights, useful for dynamically estimating thermodynamic properties from a single dataset, encapsulated by the formula:

$${G}_{n}\left(T\right)={e}^{-\frac{{\phi }_{n}}{{k}_{{\text{B}}}T}}{F}_{n}.$$
(5)

Specifically, Eq. (5) determines the Boltzmann-weighted frequency, \({G}_{n}(T)\), of configurations with a specific potential energy, \({\phi }_{n}\), at temperature \(T\), laying the groundwork for how configurations sampled at the initial temperature contribute to thermodynamic properties. Building on this foundation, the re-weighting step for adjusting these weights to estimate properties at temperatures proximal to \(T\) is given by:

$${G}_{n}\left({T}^{\prime}\right)={G}_{n}\left(T\right)\cdot {e}^{-\frac{{\phi }_{n}}{{k}_{{\text{B}}}}\left(\frac{1}{{T}^{\prime}}-\frac{1}{T}\right)}.$$
(6)

Through Eq. (6), the weights are recalibrated to a target temperature \({T}^{\prime}\) drawn from a range of temperatures adjacent to \(T\). This recalibration ensured that the initial weights \({G}_{n}(T)\) could be made to reflect the thermal energy distribution at the new temperature \({T}^{\prime}\), facilitating a thorough and dynamic evaluation of the configurational partition function and other related thermodynamic properties within this temperature range. “Calculations for condensed phases of interacting particles,” Singer noted, “appear to be quite practicable” (Singer 1966, p. 1449).
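The recalibration prescribed by Eqs. (5) and (6) amounts to a few array operations, sketched below on a hypothetical set of binned counts rather than on the argon data; the bin midpoints, raw counts, and target temperatures are all invented for the illustration.

```python
import numpy as np

kB = 1.0                                   # reduced units

def reweight(G_T, phi_n, T, T_prime):
    """Eq. (6): recalibrate Boltzmann-weighted frequencies from T to T_prime."""
    return G_T * np.exp(-(phi_n / kB) * (1.0 / T_prime - 1.0 / T))

def mean_potential_energy(G, phi_n):
    """Configurational average of the potential energy from weighted frequencies."""
    return np.sum(phi_n * G) / np.sum(G)

# Hypothetical inputs: bin midpoints phi_n and raw counts F_n from a run at T = 1.0
# (the exponential model below is invented purely for illustration).
rng = np.random.default_rng(3)
phi_n = np.linspace(0.05, 10.0, 100)
F_n = rng.poisson(1000 * np.exp(-0.2 * phi_n))
T = 1.0
G_T = np.exp(-phi_n / (kB * T)) * F_n      # Eq. (5)

for T_prime in (0.9, 1.0, 1.1):
    G_prime = reweight(G_T, phi_n, T, T_prime)
    print(f"T' = {T_prime}: <phi> = {mean_potential_energy(G_prime, phi_n):.3f}")
```

Because the recalibrated weights reduce to \({e}^{-{\phi }_{n}/({k}_{{\text{B}}}{T}^{\prime})}{F}_{n}\), they reproduce, within the energy range actually sampled, what a fresh histogram at \({T}^{\prime}\) would have given.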

The comprehensive research report appeared in McDonald and Singer (1967a, b) and triggered a wave of research. In McDonald and Singer (1967a), the researchers introduced an approach to the thermodynamic analysis of liquid argon using MC simulations parameterized by the LJ potential. Their methodology was characterized by the precision with which they evaluated a broad set of thermodynamic properties—in particular, pressure, internal energy, heat capacities, latent heats, compressibility, and thermal expansion coefficients—over a wide range of temperatures, from the triple point to the liquid–gas critical point, and for varying densities. Central to their study, McDonald and Singer employed a re-weighting technique for extrapolating thermodynamic data across various temperatures. As noted above, this method adjusted the probability distribution of the sampled configurations to simulate system behavior under different conditions without requiring new simulations. Complementarily, they introduced an isothermal extrapolation procedure to mathematically extend MC simulation data, which allowed the estimation of thermodynamic properties at slightly varied volumes. The synergistic use of re-weighting and isothermal extrapolation enhanced the efficiency of MC simulations; their results, when compared with experimental data, showed broad agreement and underlined the reliability of MC sequences in producing statistically robust results. “Calculations based on an MC method and an LJ-potential,” they observed,

have been shown to yield quantitatively significant values of thermodynamic properties of a simple liquid (argon) over a considerable \(V\), \(T\) range, including phase transitions. Satisfactory results have been obtained for the internal energy. For the pressure, the agreement with experimental values is less good but still, on the whole, satisfactory. For the derivatives of \({U}^{\dagger}\) [molar configurational internal energy] and \(P\), the agreement with experiment is better than qualitative (McDonald and Singer 1967a, p. 47).

Indeed, the authors acknowledged certain discrepancies between their results and experimental data (McDonald and Singer 1967a, p. 44, p. 47) and postulated that these discrepancies might be due either to the inherent limitations of the MC simulation methodology—specifically, the constraints imposed by finite sampling sizes and the implementation of periodic boundary conditions—or to deficiencies within the LJ potential function employed. They also expressed the expectation that further investigation would elucidate the predominant factors contributing to these observed inconsistencies. “The computing time required to obtain results of the type reported in this paper is large but not prohibitive,” they emphasized. “MC calculations for more complex systems, e.g., mixtures of simple fluids or systems governed by non-central forces, need no longer be regarded as impractical” (McDonald and Singer 1967a, p. 47).

In McDonald and Singer (1967b), the duo elucidated the thermodynamic properties of gaseous argon over four different densities within the temperature interval from −100 to 150 °C. Their methodology again centered on the histogram-based approach coupled with a re-weighting technique; however, a notable feature of their simulation design was the implementation of overlapping energy ranges that ensured seamless integration across adjacent histograms, as also noted in McDonald and Singer (1967a, p. 43). Such overlaps facilitated the effective application of the re-weighting method, which uses the inverse Boltzmann factor to recalibrate the histograms, and allowed for a broader temperature range to be included in the study. Despite achieving substantial agreement with empirical data, they noted rapid variations in the weight function relative to the potential energy, signaling potential challenges in extending this method to condensed phases. Nevertheless, their results confirmed that “calculations for argon with a Lennard–Jones potential function have shown that at gaseous densities, even at high pressures, reliable values of thermodynamic properties may be obtained with a system of only \(N=32\) molecules” (McDonald and Singer 1967b, p. 4772).

In 1969, Singer’s laboratory further expanded its scope of investigation by examining the adequacy of the LJ potential for liquid argon through MC calculations (McDonald and Singer 1969).Footnote 25 This study, which built upon the foundational work presented in Wood and Parker (1957), Wood et al. (1958), and Wood (1968), optimized the potential parameters for argon and demonstrated that the LJ potential served as an effective model. The research scope of Singer’s group expanded substantially with their foray into binary fluid mixtures (Singer 1969; McDonald 1969).Footnote 26 In this regard, Singer (1969) outlined a study of the calculation of excess free energy, volume, and enthalpy in binary mixtures of LJ liquids.Footnote 27 At the core of this approach was the application of the aforementioned extrapolation procedure (Singer 1969, p. 164), while comparing the results of his MC simulations with the predictions made by three different implementations of APM theory.Footnote 28 Such a comparative analysis proved useful in assessing the fidelity and applicability of his MC method (Singer 1969, p. 166), as well as in evaluating the effectiveness of APM theory in predicting the behavior of non-ideal liquid mixtures.Footnote 29 Around the same time, a collaborative effort was undertaken by Singer and his wife, Jean Singer (née Longstaff, 1928–1990). Singer and Singer (1970) broke new ground by introducing a method for calculating the Gibbs free energy of mixing in binary mixtures of LJ liquids through MC simulations. This work established a direct correlation between microscopic molecular parameters, such as size and intermolecular energy, and the thermodynamic properties of these mixtures. In particular, it highlighted the influence of these parameters on excess thermodynamic functions, especially the Gibbs free energy of mixing (Singer and Singer 1970, p. 280, p. 283). A subsequent paper (Singer and Singer 1972) extended this investigation and ventured into the realm of deriving macroscopic properties from intermolecular pair potentials.

By the early 1970s, Konrad Singer had established a prominent research group devoted to molecular simulation (Fig. 3) and broadened its focus to include MD. The laboratory’s efforts were not without logistical challenges; the initial lack of a dedicated computer at Royal Holloway College necessitated reliance on the Atlas Laboratory in Chilton, Oxfordshire, with computational tasks performed using Atlas Autocode on paper tape, a process that required laborious editing for any modifications.Footnote 30 The subsequent purchase of a mainframe computer and the switch to Fortran marked a considerable improvement in their computing capabilities, although some parts of the programming still required machine code due to performance constraints.Footnote 31

Fig. 3

A photo from around 1974 showing postdoctoral researcher David Adams (born 1947, on the left) at the workstation keyboard and his mentor Konrad Singer examining line printer output at the Royal Holloway College computer facility. The setup included a desktop card reader for input and a large line printer for uppercase-only output. It served several departments, including physics and mathematics. Due to the lack of a card punch, output had to be mailed to users. Image courtesy of Peter Singer from his private collection, used with permission

A representative contribution from this period—although deviating from the main focus of this retrospective—was Leslie Woodcock’s (born 1945) doctoral research under the supervision of Konrad Singer. His study, hailed as the pioneering MC simulation of molten potassium chloride, represented a breakthrough in computational chemistry (Woodcock and Singer 1971) and demonstrated the effective use of molecular simulations in the study of complex ionic systems. Also of note is the contribution of Eveline Gosling (born 1946), in collaboration with McDonald and Konrad Singer, who introduced a novel approach to non-equilibrium MD, focusing on the calculation of shear viscosity in a simple fluid (Gosling et al. 1973). Using the LJ potential to model argon, the results showed remarkable agreement with the theoretical framework provided by the Kubo formula (Gosling et al. 1973, p. 1482) and strengthened the link between MD simulations and classical fluid mechanics principles, particularly through their correlation with Stokes-type relations and the broader context of the Navier–Stokes equations. Indeed, their research set a precedent for the development of NEMD (non-equilibrium MD) and marked a substantial achievement in the field, with concurrent studies in this area (e.g., Ashurst and Hoover 1972).Footnote 32

Meanwhile, Canada emerged as an epicenter of phase-space sampling through the efforts of John Valleau (1932–2020) and his Ph.D. student Damon Card (1941–2014) at the University of Toronto. Valleau’s contributions to this field gained further momentum and recognition through his collaboration with Verlet; their scientific exchange was not limited to mere correspondence, but was enriched by Valleau’s sabbatical at Orsay in 1968 (Battimelli et al. 2020, pp. 113–114). This academic liaison proved decisive for Valleau and yielded insights useful in the preliminary draft of Hansen and Verlet (1969).Footnote 33 In Valleau and Card (1972), the authors devised a computational strategy, termed “multistage sampling,” for estimating the free energy and entropy of moderate-density particle systems characterized by hard-sphere interactions and Coulombic forces at a single density. This method built on the foundation of Markov chain-based MC techniques by integrating a hierarchical sampling framework.Footnote 34 Their approach optimized the efficiency of phase-space exploration and overcame limitations inherent in direct sampling strategies (e.g., poor convergence in regions of low probability, inefficiency in sampling large or complex systems, difficulties in properly capturing rare events) that were particularly pertinent given the computational and theoretical constraints of the 1970s.

The effectiveness of Valleau and Card’s multistage sampling hinged on the determination of the proportionality constant, \({\mu }_{0}\), which calibrates the frequency of energy configuration selections to match the system’s Boltzmann probability distribution. This constant is essential to ensure that the MC sampling accurately reflects the thermodynamic ensemble of interest. Consider the configuration integral for an \(N\)-particle system:

$${Q}_{N}=\int_{-\infty }^{\infty }\Omega \left({U}_{N}\right){\text{exp}}\left(-\frac{{U}_{N}}{{k}_{{\text{b}}}T}\right){\text{d}}{U}_{N},$$
(7)

where \({U}_{N}\) is the system’s potential energy in a given configuration and \(\Omega ({U}_{N})d{U}_{N}\) quantifies the number of accessible microstates within the energy interval \(d{U}_{N}\) around \({U}_{N}\) (Valleau and Card 1972, p. 5457). The Markov chain sampling scheme was designed to choose configurations of energy \({U}_{N}\) with a frequency proportional to:

$$\Omega \left({U}_{N}\right){\text{exp}}\left(-\frac{{U}_{N}}{{k}_{{\text{b}}}T}\right),$$
(8)

and the resulting energy density \({M}_{0}({U}_{N})\) of the MC states obeys the relation:

$${\mu }_{0}{M}_{0}\left({U}_{N}\right)=\Omega \left({U}_{N}\right){\text{exp}}\left(-\frac{{U}_{N}}{{k}_{{\text{b}}}T}\right),$$
(9)

where \({\mu }_{0}\) is unknown. Such a constant governs the frequency with which particular energy configurations are selected to align with a target probability distribution; yet, ascertaining its value posed difficulties, complicating the calculation of \({Q}_{N}\). To improve the MC sampling technique, Valleau and Card were inspired by the results presented in McDonald and Singer (1967b) and introduced the concept of “bridging distributions” \({M}_{i}({U}_{N})\), which were employed to create a series of intermediate probability distributions to facilitate a smooth transition from the initial energy distribution \({M}_{0}({U}_{N})\) to the target Boltzmann-weighted energy distribution \({\Omega }^{*}({U}_{N})\). Indeed, bridging distributions proved to be an efficient way to overcome the challenge of insufficient overlap between \({M}_{0}({U}_{N})\) and \({\Omega }^{*}({U}_{N})\), especially in large systems where direct sampling from \({M}_{0}({U}_{N})\) to \({\Omega }^{*}({U}_{N})\) may not effectively cover the relevant energy ranges. By inserting these intermediate distributions, the two researchers ensured that the MC sampling process could more efficiently explore the system’s configuration space.

The term “multistage” in Valleau and Card’s work reflects the layered approach to energy distribution sampling—a departure from the traditional methods of the time. Before this development, MC simulations relied on direct sampling based on Boltzmann probabilities, which confined the simulation to states that directly reflected their thermodynamic significance. Valleau and Card’s multistage sampling introduced a sequential strategy that adhered to the principles of the Boltzmann distribution while facilitating the integration of disparate energy distributions. This strategic layering produced an empirical energy distribution \({\Omega }^{*}({U}_{N})\) that offered an expansive view of the energetic landscape of the system; the interplay between the initial distribution \({M}_{0}({U}_{N})\) and the empirical \({\Omega }^{*}({U}_{N})\) permitted a more refined free-energy estimate obtained by analyzing the overlap between these distributions.Footnote 35 In essence, the generation of energy distributions was crucial for calculating the absolute volume of the configuration space and thus determining the configuration integral (Valleau and Card 1972, p. 5457).Footnote 36 By systematically incorporating bridging distributions into the sequential sampling process, Valleau and Card effectively bridged the gap between low- and high-energy states, enabling a uniform exploration of configuration space across the energy spectrum. The bridging distributions, designed to ensure continuity across energy levels, made it possible to estimate the configuration integral with unprecedented precision; the absolute volume of the configuration space for each energy level was determined by aggregating the contributions of all relevant bridging distributions, which provided a detailed understanding of the thermodynamic behavior of the system. As a result, the empirical energy distribution \({\Omega }^{*}({U}_{N})\) derived from this multistage approach offered a granular representation of the system’s potential energy configurations, improved the accuracy of the configurational partition-function estimation, and, in turn, sharpened the free-energy calculations.

The authors applied their method to study an electrically neutral mixture of singly charged hard spheres: for smaller systems (\(N=32\) and 64), there was sufficient overlap between \({M}_{0}({U}_{N})\) and \({\Omega }^{*}({U}_{N})\), eliminating the need for further sampling. For larger systems (\(N=200\)), these two distributions were well separated, necessitating a bridging distribution \({M}_{1}({U}_{N})\) that provided continuity and overlap (Valleau and Card 1972, p. 5459). While the selection of the functional form for these bridging distributions was not constrained to a specific type, Valleau and Card opted for exponential weighting functions, akin to the Boltzmann-weighted distributions. This choice was advantageous as it allowed the use of existing sampling techniques without necessitating the development of new methods:

$${\mu }_{i}^{*}{M}_{i}({U}_{N})={\Omega }^{*}({U}_{N})\exp(-{U}_{N}/{\alpha }_{i}{k}_{{\text{b}}}T),$$
(10)

where \({\alpha }_{i}>0\) was chosen to ensure sufficient overlap. By comparing the areas under the curves of these distributions over a range of energies, the proportionality constants \({\mu }_{i}^{*}\) could be determined (Valleau and Card 1972, p. 5459, pp. 5461–5462). This approach exploited the detailed knowledge of the shape of the energy distributions generated in the MC runs and allowed the estimation of the proportionality constants. At this point, knowledge of \({\mu }_{i}^{*}\) enabled the calculation of the free energy, as it provided the necessary scaling to determine the absolute value of \({Q}_{N}\). Their method proved to be “fairly straightforward and evidently successful” (Valleau and Card 1972, p. 5461).
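To make the matching of Eqs. (9) and (10) concrete, the following sketch (in Python; a schematic reconstruction rather than Valleau and Card’s actual procedure) shows how the relative normalization constants of two overlapping, sampled energy distributions can be estimated from the bins they share, given the known weight factors attached to each run. The function names and inputs are hypothetical.

```python
import numpy as np

def relative_normalization(hist_a, hist_b, centers, w_a, w_b):
    """Estimate mu_a / mu_b for two sampled energy distributions that share
    the same density of states Omega(U) but were generated with different,
    known weight functions:  mu_x * M_x(U) = Omega(U) * w_x(U).

    hist_a, hist_b : normalized histograms M_a(U), M_b(U) on common bins
    centers        : bin-center energies U
    w_a, w_b       : callables returning the weight factors (e.g., exp(-U/kT))
    """
    overlap = (hist_a > 0) & (hist_b > 0)        # bins visited by both runs
    u = centers[overlap]
    # In the overlap region, M_a(U) w_b(U) / (M_b(U) w_a(U)) = mu_b / mu_a,
    # so averaging this ratio over the shared bins gives the normalization.
    ratio_b_over_a = np.mean(hist_a[overlap] * w_b(u) /
                             (hist_b[overlap] * w_a(u)))
    return 1.0 / ratio_b_over_a

def chain_normalizations(pairwise):
    """Combine successive ratios mu_0/mu_1, mu_1/mu_2, ... along a chain of
    bridging distributions to relate the first stage to the last."""
    return float(np.prod(pairwise))
```

In this reading, chaining the pairwise ratios across the bridging distributions is what ties the separate MC runs together on a common scale.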

Insightful applications soon followed. Grenfell Patey (born 1948) and Glenn Torrie (born 1949), who were also part of the group during this period, played a role in validating and refining the multistage sampling approach. In Patey and Valleau (1973), for example, the authors applied the multistage sampling method to the MC estimation of the free energy of a fluid composed of hard spheres with embedded point dipoles, a fluid that was at that time “the subject of a good deal of theoretical interest” (Patey and Valleau 1973, p. 297). They compared their results with those obtained by thermodynamic integration techniques and theoretical models, demonstrating the efficiency of multistage sampling in this context.

In addition, Torrie and Valleau (1974) introduced an MC technique for estimating the free energies of fluids that addressed the limitations of traditional Boltzmann distribution-based MC simulations. The core of their method involved a special form of “importance sampling” that differed from conventional MC techniques by using a non-standard, non-Boltzmann probability function. In their method, Torrie and Valleau used a weighting function within their sampling strategy to intentionally bias the focus toward configurations that were typically underrepresented in standard simulations; this intentional bias was directed toward regions of configuration space characterized by rapid changes in the function of interest. To maintain statistical reliability, a mathematical correction method was applied to compensate for this non-Boltzmann bias and ensure adequate statistical representation.Footnote 37 Their study also explained the synergy between importance sampling and multistage sampling (Torrie and Valleau 1974, pp. 579–580); their combined application ensured a layered and more comprehensive traversal of configuration space, optimizing the reliability of free-energy calculations by biasing sampling toward regions of significant thermodynamic contribution, while maintaining statistical rigor through subsequent correction of introduced biases. Indeed, the duo demonstrated the efficacy of their technique by evaluating the free energy of an LJ fluid in its liquid–vapor coexistence phase (Torrie and Valleau 1974, pp. 580–581); this choice was strategic because the inverse-twelve (soft-sphere) fluid does not exhibit condensation and provided a more manageable system for comparison.Footnote 38
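The correction for a non-Boltzmann bias has a simple generic form: if configurations are generated with an extra, known weight \(w\), an unbiased ensemble average is recovered by dividing each observation by \(w\) and renormalizing. The sketch below is a minimal illustration of this standard re-weighting identity (in Python, with hypothetical inputs), not code from the 1974 paper.

```python
import numpy as np

def unbiased_average(observable, bias_weights):
    """Recover a Boltzmann-ensemble average from a biased (non-Boltzmann) run:

        <A>_Boltzmann = <A / w>_biased / <1 / w>_biased

    observable   : array of A(q) values recorded along the biased run
    bias_weights : array of the corresponding extra weights w(q), all > 0
    """
    inv_w = 1.0 / np.asarray(bias_weights)
    return np.mean(np.asarray(observable) * inv_w) / np.mean(inv_w)
```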

5 Overlapping ensembles and non-physical distributions

At a time of rapid developments in molecular simulations, researchers were consistently confronted with the dual challenge of achieving computational viability while ensuring scientific rigor. Traditional methods, as discussed previously, often found themselves navigating the delicate balance between computational demands and the need for a detailed representation of molecular interactions. Against this backdrop, in 1975, IBM researcher Charles Bennett (born 1943) introduced a concept that would alleviate these long-standing challenges. He proposed the mass tensor—a mathematical construct, envisioned for MD studies, defined as an arbitrary positive-definite symmetric matrix (Bennett 1975).Footnote 39 This approach was designed to refine the kinetic energy function of a system of point masses in a way that preserved the equilibrium properties of the system. Bennett’s proposal was a reimagining of how dynamic molecular systems could be modeled.Footnote 40 His approach was characterized by replacing the standard kinetic energy expression, typically denoted as:

$$\frac{1}{2}\sum_{i=1}^{N}{m}_{i}{\dot{q}}_{i}^{2},$$
(11)

with a more general quadratic form represented by the expression:

$$\frac{1}{2}\sum_{i,j=1}^{N}{\dot{q}}_{i}{M}_{ij}{\dot{q}}_{j},$$
(12)

where \(N\) represents the total number of particles in the system, \({q}_{i}\) is the \(i\)th Cartesian coordinate, \({m}_{i}\) is the mass of the \(i\)th particle, and \({M}_{ij}\) is the mass tensor element that couples the motions of particles \(i\) and \(j\). By implementing this modification, Bennett was able to slow down high-frequency motions and speed up low-frequency ones; this adjustment greatly increased the efficiency of exploring the system’s configuration space, achieving more thorough investigations within constrained computational timeframes. Indeed, when tested on an LJ polymer chain, his results indicated that “the mass tensor method accomplishes about ten times as much configurational change per time step as ordinary dynamics, or about five times as much per second of computer time” (Bennett 1975, p. 276).
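As a rough illustration of what the substitution of Eq. (12) for Eq. (11) means in practice (a generic sketch, not Bennett’s implementation), the MD integration step below obtains accelerations by solving \(M\ddot{q}=F\) rather than dividing each force component by a scalar mass; because a constant, positive-definite \(M\) only alters the momentum distribution, configurational equilibrium averages are unaffected. The force routine and time step are placeholders.

```python
import numpy as np

def mass_tensor_verlet_step(q, v, force, M, dt):
    """One velocity-Verlet step with a constant, symmetric, positive-definite
    mass tensor M in place of per-particle scalar masses.

    q, v  : generalized coordinates and velocities (flat arrays of length 3N)
    force : callable returning -dU/dq at a configuration q
    M     : (3N, 3N) mass tensor; ordinary dynamics corresponds to a diagonal M
    dt    : integration time step (placeholder)
    """
    a_old = np.linalg.solve(M, force(q))          # a = M^{-1} F(q)
    q_new = q + v * dt + 0.5 * a_old * dt**2
    a_new = np.linalg.solve(M, force(q_new))
    v_new = v + 0.5 * (a_old + a_new) * dt
    return q_new, v_new
```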

Building on the foundational work of Valleau and his group, in 1976, Bennett presented a comprehensive study aimed at unraveling the complexities of estimating free-energy differences between two canonical ensembles using MC data (Bennett 1976).Footnote 41 His research introduced novel methods that were optimized based on several factors: the amount of overlap between the two ensembles, the smoothness of the density of states as a function of the difference potential, and the relative MC sampling cost per statistically independent data point (Bennett 1976, p. 245). This framework represented a departure from the prevailing methods of the time, which were predominantly rooted in perturbation theory (see, e.g., Zwanzig 1954; Smith 1973), numerical integration (Hoover and Ree 1967, 1968; Hansen and Verlet 1969; Squire and Hoover 1969), and techniques that used intermediate reference systems to bridge distributions (e.g., Valleau and Card 1972; Torrie and Valleau 1974). Unlike these earlier strategies, which were often limited in complex systems characterized by states with minimal overlap or required non-linear transformations for thorough analysis, Bennett’s framework facilitated direct comparison between two canonical ensembles. This approach, which avoided dependence on perturbative assumptions or the need to construct bridging distributions through intermediate reference systems, allowed for an intuitive and straightforward estimation of free-energy differences. In contrast to the approach presented in Torrie and Valleau (1974), which modified MC sampling to target specific configuration spaces via importance sampling and was distinguished by its ability to estimate free energies for fluid systems, Bennett broadened the scope of applications. He provided a more general and versatile toolkit for the direct, and more accurate, estimation of free-energy differences across a broad range of systems and transformations, enhancing methodological rigor and applicability in studies of phase transitions and chemical reactions.

At the beginning of his article, Bennett set the stage for understanding the computational challenges facing MD and MC simulations in the mid-1970s. He elucidated the prevailing dilemma: the direct computation of thermal properties from canonical or microcanonical ensembles was a formidable task for classical systems, given their extensive degrees of freedom. This conundrum, symbolic of the computational limitations of the time, serves as the historical context in which Bennett’s methods emerged and offered innovative solutions that would change the landscape of computational statistical mechanics.

In general, the free energy of a Monte Carlo (MC) or molecular dynamics (MD) system can be determined only by a procedure analogous to calorimetry, i.e., by establishing a reversible path between the system of interest and some reference system of known free energy. “Computer calorimetry” has a considerable advantage over laboratory calorimetry in that the reference system may differ from the system of interest not only in its thermodynamic state variables but also in its Hamiltonian, thereby making possible a much wider variety of reference systems and reversible paths. […] Whether the calorimetric path has one step or many, one eventually faces the statistical problem of extracting from the available data the best estimate of the free energy differences between consecutive systems. Specializing the question somewhat, one might inquire what is the best estimate one can make of the free energy difference between two MC systems (i.e., two canonical ensembles on the same configuration space), given a finite sample of each ensemble (Bennett 1976, pp. 245–246).

His introduction of the “acceptance ratio estimator” proved to be a viable methodological answer that reshaped the computational landscape (Bennett 1976, p. 246, p. 250). Using data from two different canonical ensembles, he revealed that the effectiveness of the estimator depended on the overlap between these ensembles.Footnote 42 Bennett also claimed that optimal accuracy in free-energy estimation could be achieved by a balanced allocation of computational resources to both ensembles, thus maximizing the statistical efficiency of the estimation process (Bennett 1976, p. 250). He further elucidated the “interpolation method,” a curve-fitting technique that relied on the smooth functional interdependence between the density of states and potential energy differences within each ensemble (Bennett 1976, pp. 259–261). This approach, aimed at improving the quality of free-energy calculations, exploited the continuity inherent in the density-of-states function across intermediate ensembles for effective interpolation of free-energy differences, especially under conditions of minimal ensemble overlap.Footnote 43 Bennett’s discourse addressed very timely computational challenges and charted a new course for molecular simulation research.

A substantial segment of his article is devoted to the mathematical articulation of the acceptance ratio method. Bennett considered the canonical configurational integral, denoted as:

$$Q=\int {e}^{-U({q}_{1},\dots ,{q}_{N})}{\text{d}}{q}_{1}\dots {\text{d}}{q}_{N},$$
(13)

where \(U\) is the temperature-scaled potential energy, contingent on the system’s \(N\) configurational degrees of freedom. His research revealed that the ratio of two configurational integrals, \({Q}_{0}\) and \({Q}_{1}\), defined by distinct potential functions \({U}_{0}\) and \({U}_{1}\), could be articulated as a ratio of canonical averages. This relationship is encapsulated by the Metropolis criterion, articulated through the Metropolis function \(M(x)={\text{min}}\{1,{e}^{-x}\}\), where \(x\) represents the energy difference between the new and the old configurations.Footnote 44 Central to the acceptance ratio method is the determination of the probability of accepting changes in the system’s potential energy: these changes, or “potential-switching trial moves,” involve the system transitioning from one potential energy function to another. The key assessment in this process is whether to accept or reject this transition, a decision based on the Metropolis function. These probabilities led to the equation:

$$\frac{{Q}_{0}}{{Q}_{1}}=\frac{\langle M({U}_{0}-{U}_{1}){\rangle }_{1}}{\langle M({U}_{1}-{U}_{0}){\rangle }_{0}},$$
(14)

where the angle brackets denote canonical averages. Equation (14) shows that the ratio of configurational integrals could be estimated by taking the indicated averages over separately-generated samples of the \({U}_{0}\) and \({U}_{1}\) ensembles. In other words, given that the configurational integrals \({Q}_{0}\) and \({Q}_{1}\) are inherently challenging to compute directly, especially for complex systems, Eq. (14) offered an indirect yet robust method to estimate the ratio of these integrals, facilitating the calculation of free-energy differences. Indeed, the term \(\langle M({U}_{0}-{U}_{1}){\rangle }_{1}\) denotes the average of the Metropolis function applied to the potential energy difference \({U}_{0}-{U}_{1}\) taken over samples from the \({U}_{1}\) ensemble. Conversely, \(\langle M({U}_{1}-{U}_{0}){\rangle }_{0}\) is the average of the Metropolis function applied to the potential energy difference \({U}_{1}-{U}_{0}\) evaluated over samples from the \({U}_{0}\) ensemble. The term “acceptance” in this context refers to the probability of accepting a potential-switching trial move between the two ensembles. As underlined, the method evaluates how likely it is for a configuration from one ensemble to be accepted as a valid configuration in the other ensemble, based on the difference in potential energy between the two ensembles. The degree of overlap between the ensembles determines the quality of the canonical averages, which in turn affects the accuracy of the estimator. A pronounced overlap implies that the two ensembles share a significant portion of configuration space, leading to more reliable estimates. Conversely, a minimal overlap might result in less precise estimates, as the canonical averages might not capture the full behavior of the systems in their shared configuration space.
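Read operationally, Eq. (14) only requires the two potential functions to be evaluated on samples drawn from each ensemble. A minimal numerical sketch (in Python, with hypothetical array names; energies assumed already scaled by \({k}_{{\text{b}}}T\), as in Eq. (13)) might look as follows.

```python
import numpy as np

def metropolis(x):
    """Metropolis function M(x) = min(1, exp(-x)), written overflow-free."""
    return np.exp(-np.maximum(x, 0.0))

def q0_over_q1(u0_on_1, u1_on_1, u0_on_0, u1_on_0):
    """Estimate Q0/Q1 via Eq. (14).

    u0_on_1, u1_on_1 : U0 and U1 evaluated on configurations sampled from the
                       U1 ensemble
    u0_on_0, u1_on_0 : U0 and U1 evaluated on configurations sampled from the
                       U0 ensemble
    """
    numerator = np.mean(metropolis(u0_on_1 - u1_on_1))    # <M(U0 - U1)>_1
    denominator = np.mean(metropolis(u1_on_0 - u0_on_0))  # <M(U1 - U0)>_0
    return numerator / denominator
```

In this convention, the configurational free-energy difference then follows, up to the factor \({k}_{{\text{b}}}T\), as \(-{\text{ln}}({Q}_{0}/{Q}_{1})\).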

Bennett’s article also provides a thorough examination of the limitations and practical applications of the acceptance ratio method (Bennett 1976, p. 248). This exploration, nestled within the broader scope of his paper, is emblematic of the scientific milieu of the time and gives insight into the state of molecular simulations in 1976. His discourse made it clear that this technique had the ability to facilitate comparisons between systems characterized by different degrees of freedom as well as specialized ensembles used in MC simulations. However, he emphasized that the mean acceptance probabilities for both systems under consideration had to be sufficiently large to allow their determination with a degree of statistical accuracy deemed acceptable, within the limits of an MC simulation of a reasonable duration. This constraint was critical, as it had major implications for the feasibility and reliability of using his method in practical scenarios, particularly where the availability of computational resources and time were limiting factors. In the context of refining the estimation of free energy, Bennett made use of the Fermi function (Bennett 1976, p. 249):

$$f\left(x\right)=\frac{1}{1+{e}^{x}}.$$
(15)

The superiority of the Fermi function in estimating free-energy differences was well articulated by Bennett, who noted its “softer shoulder” compared to the Metropolis function (Bennett 1976, p. 252). This characteristic resulted in a narrower distribution of acceptance probabilities and offered enhanced precision in determining the ensemble average acceptance probability derived from a specific dataset. In scenarios characterized by extensive sample sizes, Bennett formulated an optimized acceptance ratio estimator (Bennett 1976, p. 250), which led to the derivation of the equation:

$$\frac{{Q}_{0}}{{Q}_{1}}=\frac{\langle f({U}_{0}-{U}_{1}+C){\rangle }_{1}}{\langle f({U}_{1}-{U}_{0}-C){\rangle }_{0}}{\text{exp}}\left(+C\right).$$
(16)

Here, \(C={\text{ln}}({Q}_{0}{n}_{1}/{Q}_{1}{n}_{0})\) represents a shift constant that minimizes the expected square error, while \({n}_{0}\) and \({n}_{1}\) denote the sample sizes. \(C\) shifts the origin of one of the potentials and thereby tunes the estimator for maximal statistical efficiency. Bennett’s work clarified that the optimized formula for \({Q}_{0}/{Q}_{1}\) diverged from Eq. (14) primarily in two aspects: the incorporation of the Fermi function and the adjustment of the origin of one of the potentials by the additive constant \(C\), reinforcing the importance of the Fermi function in refining the reliability of free-energy difference estimates.
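Because the optimal shift \(C\) in Eq. (16) contains the very ratio \({Q}_{0}/{Q}_{1}\) being sought, in practice it is determined self-consistently. The sketch below (a simplified Python illustration rather than Bennett’s own implementation, with hypothetical inputs as in the previous sketch) iterates Eq. (16) to a fixed point starting from \(C=0\).

```python
import numpy as np

def fermi(x):
    """Fermi function f(x) = 1 / (1 + exp(x)) of Eq. (15), overflow-free."""
    return np.exp(-np.logaddexp(0.0, x))   # since ln(1 + e^x) = logaddexp(0, x)

def q0_over_q1_optimized(u0_on_1, u1_on_1, u0_on_0, u1_on_0, n_iter=50):
    """Self-consistent evaluation of Eq. (16), with C = ln(Q0 n1 / Q1 n0)."""
    n1, n0 = len(u1_on_1), len(u0_on_0)
    c = 0.0
    ratio = 1.0
    for _ in range(n_iter):
        ratio = (np.mean(fermi(u0_on_1 - u1_on_1 + c))
                 / np.mean(fermi(u1_on_0 - u0_on_0 - c))) * np.exp(c)
        c = np.log(ratio * n1 / n0)        # refresh C from the current estimate
    return ratio
```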

Significantly, Bennett emphasized that “[m]ost previous overlap methods for determining free energy differences can be regarded as special cases of the acceptance ratio method” (Bennett 1976, p. 264). Previous strategies, while diverse in their specific implementations, essentially aimed to quantify the overlap between ensembles to facilitate free-energy calculations; the acceptance ratio method, by generalizing this core principle into a systematic, formulaic approach, encapsulated these varied tactics within a singular, expansive framework. As a matter of fact, he expressed hesitancy about the approach presented in Valleau and Card (1972):

Apparently the first overlap calculation of a free energy difference between two systems whose potentials had differing “soft” parts was that of Valleau and Card. These authors, interested in determining the thermodynamic properties of a fluid of charged hard spheres as a function of temperature, compared systems whose \(U\) differed by a constant factor, i.e., systems having the same unscaled potential but different temperatures. Their somewhat complicated procedure for estimating \(\Delta A\) [the free-energy difference] can be recognized as accomplishing the same result, with somewhat less statistical efficiency […] (Bennett 1976, p. 264).

Bennett’s critique of Valleau and Card (1972) reflects a subtle understanding of the trade-offs inherent in free-energy calculation methods. Valleau and Card’s work on overlap calculations for systems with differing soft potentials made a major contribution to computational chemistry, focusing on the thermodynamic properties of charged hard spheres. Their approach, by scaling the potential energy across different temperatures, introduced a methodological innovation by directly addressing the challenge of comparing systems with inherently different interaction strengths. However, Bennett’s hesitation about their method stemmed from concerns about statistical efficiency. His observation highlights a critical aspect of computational methods: the balance between methodological complexity and the statistical robustness of the results. While Valleau and Card’s procedure effectively bridged the free-energy differences between the compared systems, Bennett suggested that this could potentially be achieved with greater statistical efficiency through alternative approaches.

Torrie and Valleau (1977) appeared 4 months after Bennett (1976) and proved to be a key solution to the inherent limitations of conventional MC methods in the field of free-energy calculations; traditional MC techniques, which relied on Boltzmann-weighted sampling, inherently biased exploration toward high-probability regions of the configuration space.Footnote 45 This bias resulted in inadequate sampling of energetically unfavorable but thermodynamically critical regions, which compromised the accuracy of free-energy calculations. Recognizing the “extremely inefficient” nature of the Boltzmann-sampling framework (Torrie and Valleau 1977, p. 187), the authors introduced umbrella sampling, conceived to employ biased, non-physical sampling distributions and thereby enable a more equitable and exhaustive sampling of the configuration space.Footnote 46 The methodological shift it brought was the strategic application of weighted sampling functions that allowed for a coherent exploration of a continuum of states within a single, unified simulation framework. This approach contrasted with the earlier discrete simulations proposed by Valleau and Card (1972) and Torrie and Valleau (1974), which did not allow for a seamless, continuous transition between different states of the system within a single simulation. Umbrella sampling, thus, represented a departure that synthesized and “extended such techniques to explore systematically large regions of a phase diagram, applying them to the Lennard–Jones system in a wide range of temperature and pressure including part of the gas–liquid coexistence region” (Torrie and Valleau 1977, p. 187).

Torrie and Valleau used the weighting function \(w({{\varvec{q}}}^{N})\) to alter the probability distribution of system configurations across the regions of configuration space relevant to the physical systems under study. A defining feature of umbrella sampling was its ability to extend the exploration of the reduced energy region, \(\Delta {U}^{*}=\Delta U/{k}_{{\text{b}}}T\), by up to three times compared to what was achievable with standard MC methods.Footnote 47 This development enabled a rigorous determination of probability density function values down to \({10}^{-8}\) (Torrie and Valleau 1977, p. 189). In fact, the essence of umbrella sampling lay in the core mathematical expression:

$$\frac{A(T)}{{k}_{{\text{b}}}T}-\frac{A({T}_{0})}{{k}_{{\text{b}}}{T}_{0}}=-{\text{ln}}\langle {\text{exp}}(-\Delta U^*){\rangle }_{0}=-{\text{ln}}\int_{-\infty }^{\infty }{f}_{0}(\Delta {U}^{*}){\text{exp}}(-\Delta {U}^{*}){\text{d}}\Delta {U}^{*}$$
(17)

Here, \(A(T)\) and \(A({T}_{0})\) are, respectively, the Helmholtz free energy of the system of interest with internal energy \(U({{\varvec{q}}}^{N})\) at temperature \(T\) and that of a reference system with internal energy \({U}_{0}({{\varvec{q}}}^{N})\) at temperature \({T}_{0}\), the angle brackets \(\langle ...{\rangle }_{0}\) denote an ensemble average over the reference system at temperature \({T}_{0}\), and \({f}_{0}(\Delta {U}^{*})\) is the probability density of the reduced energy difference \(\Delta {U}^{*}\) between the system configurations and the reference. Equation (17) expresses the free-energy difference as the negative logarithm of an integral over all possible energy differences, weighted by the exponential of the negative energy difference; the selection of \(w({{\varvec{q}}}^{N})=W(\Delta {U}^{*})\) was guided by heuristic principles aimed at balancing the exploration of energetically unfavorable regions with the efficient sampling of more probable states (Torrie and Valleau 1977, p. 189).Footnote 48 This balance was key to ensuring that the sampling explored a wider range of energy states compared to conventional MC simulations, while also maintaining sufficient accuracy and statistical relevance in the regions most important for calculating free-energy differences.

Building on Eq. (17), Torrie and Valleau introduced an element that further distinguished their approach: the scaling factor \(\alpha \) (Torrie and Valleau 1977, pp. 189–190). This strength parameter helped to modulate the reduced energy differences and allowed for the estimation of free-energy differences across an expansive range of values. Such a modulation was achieved by scaling the energy differences within the integral of the probability density function \({f}_{0}(\Delta {U}^{*})\). The parameter \(\alpha \), varying between 0 and 1, facilitated a subtle adjustment of the energy landscape—an effective move in deriving free-energy estimates from the probability distribution \({f}_{0}(\Delta {U}^{*})\). Lower values of \(\alpha \) focused on smaller energy differences (i.e., more localized regions of the configuration space), crucial for microstates characterized by low probability yet significant for free-energy calculations; conversely, higher values of \(\alpha \) allowed the simulation to sample over a wider range of energy states, helping to understand the overall behavior of the system by not focusing disproportionately on the high-energy or low-probability regions. The refined equation that reflects this approach is:

$${\left(\frac{A(T)}{{k}_{{\text{b}}}T}\right)}_{\alpha }-\frac{A({T}_{0})}{{k}_{{\text{b}}}{T}_{0}}=-{\text{ln}}\int_{-\infty }^{\infty }{f}_{0}(\Delta {U}^{*}){\text{exp}}(-\alpha \Delta {U}^{*}){\text{d}}\Delta {U}^{*}$$
(18)

Through the application of the weighting function \(w({{\varvec{q}}}^{N})\) and the modulation of \(\alpha \), umbrella sampling effectively bridged the gap between reference and target systems and allowed extensive exploration of the configuration space with unparalleled efficiency.
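Numerically, Eqs. (17) and (18) reduce to re-weighted averages over the recorded values of \(\Delta {U}^{*}\). The sketch below (schematic, in Python, with hypothetical inputs) evaluates the right-hand side of Eq. (18) on a grid of \(\alpha \) values, recovering Eq. (17) at \(\alpha =1\); each sample is divided by its umbrella weight \(W(\Delta {U}^{*})\) to undo the non-Boltzmann bias, as in the re-weighting identity sketched earlier.

```python
import numpy as np

def reduced_free_energy_vs_alpha(delta_u_star, umbrella_w, alphas):
    """Estimate (A(T)/k_B T)_alpha - A(T0)/k_B T0 from umbrella-sampling data.

    delta_u_star : reduced energy differences Delta U* recorded along the run
    umbrella_w   : corresponding umbrella weights W(Delta U*); use an array of
                   ones if the run was plain Boltzmann sampling
    alphas       : iterable of scaling factors, 0 < alpha <= 1
    """
    du = np.asarray(delta_u_star)
    inv_w = 1.0 / np.asarray(umbrella_w)
    norm = np.mean(inv_w)                          # undoes the bias overall
    estimates = []
    for alpha in alphas:
        avg = np.mean(np.exp(-alpha * du) * inv_w) / norm
        estimates.append(-np.log(avg))             # Eq. (18); alpha = 1 gives Eq. (17)
    return np.array(estimates)
```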

In applying their method, Torrie and Valleau examined the liquid–gas phase-transition region of an LJ fluid and demonstrated its applicability across various temperatures and densities within the gas–liquid coexistence curve (Torrie and Valleau 1977, pp. 190–197). In the section detailing their application of the umbrella sampling technique, the authors articulated the rationale of the approach they took to determine the free-energy difference between an LJ fluid and an inverse-twelve soft-sphere fluid at the same temperature, across seven densities on a supercritical isotherm (Torrie and Valleau 1977, pp. 190–192). This was done to probe phase-transition regions, which were difficult to treat with conventional methods. The soft-sphere configurations were favored by the aforementioned weighting function, which enabled the determination of the free-energy difference with remarkable precision; they noted, in particular, that “at the higher densities on the supercritical isotherm the ‘soft-sphere’ reference system and the Lennard–Jones system have sufficiently similar configurations that a single umbrella-sampling experiment is powerful enough to determine \(\Delta A\).” (Torrie and Valleau 1977, pp. 190–191). This was a key insight, because if the reference system shared the same internal-energy function as the system of interest, Torrie and Valleau could simplify Eq. (17) to the expression:Footnote 49

$$\frac{A(T)}{{k}_{{\text{b}}}T}=\frac{A({T}_{0})}{{k}_{{\text{b}}}{T}_{0}}-{\text{ln}}{\left\langle {\text{exp}}\left[-U\left(\frac{1}{{k}_{{\text{b}}}T}-\frac{1}{{k}_{{\text{b}}}{T}_{0}}\right)\right]\right\rangle }_{0}$$
(19)

or, in terms of a one-dimensional integral over \({\text{d}}U\),

$$\frac{A(T)}{{k}_{{\text{b}}}T}=\frac{A({T}_{0})}{{k}_{{\text{b}}}{T}_{0}}-{\text{ln}}\int_{-\infty }^{\infty }{f}_{0}(U){\text{exp}}\left[-U\left(\frac{1}{{k}_{{\text{b}}}T}-\frac{1}{{k}_{{\text{b}}}{T}_{0}}\right)\right]{\text{d}}U$$
(20)

The term inside the exponential function in Eq. (19) and Eq. (20) represents the energy adjustment necessary to account for the temperature difference between the system of interest at \(T\) and the reference system at \({T}_{0}\), highlighting the role of the internal energy in this thermodynamic change.Footnote 50 Using this new formulation, Torrie and Valleau demonstrated that the free-energy difference between two states—mediated through the internal-energy difference and modulated by the temperature difference—could be calculated with precision. “This is very powerful,” they noted, “because the ‘intermediate’ systems that result from multiplying \(-U\) in the exponent by a smaller number (cf. \(\alpha \) \(<\) 1 […]) can now be interpreted as those with temperatures between \(T\) and \({T}_{0}\). A single sampling of \({f}_{0}(U)\) can therefore give the free energy over a whole range of temperatures” (Torrie and Valleau 1977, p. 193).
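The “whole range of temperatures” remark translates directly into a re-use of stored data: a single set of internal-energy samples collected at \({T}_{0}\) can be re-weighted via Eq. (19) to yield \(A(T)/{k}_{{\text{b}}}T\) at many temperatures. The sketch below is a schematic Python illustration under that assumption (inputs hypothetical); for umbrella-sampled data the bias correction shown earlier would be applied as well.

```python
import numpy as np

def helmholtz_over_kT(u_samples, t0, temperatures, a0_over_kT0, k_b=1.0):
    """Evaluate A(T)/k_B T from Eq. (19) on a grid of temperatures T, using
    internal-energy samples U(q) recorded in a single reference run at T0.

    a0_over_kT0 : known reference value A(T0)/k_B T0
    k_b         : Boltzmann constant in the units of u_samples (default 1)
    """
    u = np.asarray(u_samples)
    results = []
    for t in temperatures:
        delta_beta = 1.0 / (k_b * t) - 1.0 / (k_b * t0)
        # <exp[-U (1/k_B T - 1/k_B T0)]>_0 over the reference samples
        avg = np.mean(np.exp(-u * delta_beta))
        results.append(a0_over_kT0 - np.log(avg))
    return np.array(results)
```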

Their approach involved a comparative analysis in which they evaluated the free energy of the LJ fluid obtained by their technique against established MC simulations presented in notable studies in the field (e.g., Gosling and Singer 1970, 1973; Valleau and Whittington 1973). Table III in their article (Torrie and Valleau 1977, p. 194) offers a comparison that provides an insightful historical sketch and a comprehensive view of umbrella sampling’s efficacy across various states. These calculations were executed on configurations comprising 32 and 108 particles, but the agreement of their results with earlier thermodynamic integration results for larger systems (Hansen and Verlet 1969; Levesque and Verlet 1969; Hansen 1970; McDonald and Singer 1972) supported their earlier hypothesis that the dependence of the free-energy differences on the number of particles was relatively small in dense systems (Torrie and Valleau 1974, p. 578). The observation of minimal \(N\)-dependence confirmed the robustness of the umbrella sampling method and its superiority over traditional techniques, which often required higher particle counts to achieve the same level of precision.Footnote 51 Such efficiency in optimizing smaller systems without compromising accuracy directly tackled a significant limitation inherent in prior models, specifically in terms of computational cost and resource requirements. Figure 4 provides a visual representation of the overall congruence.

Fig. 4

Comparative configurational Helmholtz free energy near liquid–gas coexistence for an LJ fluid. Open and closed circles indicate umbrella sampling results and triangles represent data from Hansen and Verlet (1969), Levesque and Verlet (1969), and Hansen (1970). The broken lines represent the fitted model of McDonald and Singer (1972). This figure illustrates the efficiency of umbrella sampling in phase-transition analysis, which closely matched other established methods. See Torrie and Valleau (1977), p. 195

This finding was “very pleasing,” the authors noted, “since it means that good free-energy estimates can be made very economically, where there exist good data for a suitable reference system” (Torrie and Valleau 1977, p. 195). In addition, Table IV of their article (Torrie and Valleau 1977, p. 196) shows the mean internal energy of the LJ fluid, derived by re-weighting the results of the umbrella-sampling simulations. Again, these results were in good agreement with those from previous research, particularly at higher densities (e.g., Levesque and Verlet 1969; McDonald and Singer 1972). The standard deviation of the mean for energies derived from re-weighted umbrella-sampling MC simulations was about 0.02 \(N\epsilon \) (where \(\epsilon \) is the well depth parameter of the LJ potential) for a sequence of 3–5 × \({10}^{5}\) configurations, slightly exceeding that from a Boltzmann-weighted experiment of similar duration (Torrie and Valleau 1977, p. 197).

In the late 1970s, within the context of condensed matter studies, researchers employing MC simulations for free-energy calculations were faced with a choice between Bennett’s methods and the umbrella sampling technique. The practical application of Bennett’s acceptance ratio and interpolation methods relied on two critical assumptions: significant ensemble overlap and a smoothly varying density of states. Achieving these conditions, however, presented distinct challenges in systems with high complexity or those undergoing dramatic changes in thermodynamic parameters (e.g., systems characterized by dense configuration spaces and landscapes filled with steep valleys, saddle points, and local minima). This complexity could severely limit the extent of ensemble overlap, as the probability distributions of the reference and target ensembles may occupy largely disjoint regions of the configuration space. Consequently, without sufficient overlap, the ability of the acceptance ratio method to accurately estimate free-energy differences was undermined. Similarly, the interpolation method’s reliance on a smoothly varying density of states assumed a degree of continuity and predictability in the system’s energy landscape that may not exist in highly complex or dynamically evolving systems. Abrupt changes or discontinuities in the energy landscape could invalidate the smoothness assumption and render the interpolation of free-energy differences across such regions potentially inaccurate.

Despite these challenges, Bennett’s methods adapted well to different thermodynamic conditions, including changes in temperature and density, and proved particularly valuable in scenarios where the energy landscape of the system was relatively well understood and of moderate complexity (e.g., LJ fluid systems, binary mixtures). Delineating the subtleties of thermodynamic behavior, his approaches excelled in systems where the predictability of state transitions facilitated a more refined analysis of phase equilibria and metastable states. In addition, his techniques provided a robust framework for the analysis of thermodynamic states exhibiting gradual changes, where the underlying assumptions of significant ensemble overlap and density of state smoothness were more likely to be met. This was particularly evident in the study of phase transitions in simple fluids, as well as in the melting and freezing processes of simple crystalline structures, where controlled temperature adjustments allowed researchers to accurately assess phase changes while ensuring that the simulations maintained significant ensemble overlap throughout these transformations.

On the other hand, umbrella sampling allowed an effective exploration of the configuration space over a spectrum of temperatures and densities. Developed as a response to circumvent the limitations of traditional Boltzmann sampling techniques, this framework relied on the judicious selection of the weighting function \(w({{\varvec{q}}}^{N})\). However, such a choice was not universally prescriptive and had to be tailored to the specific energy landscape of the system under study. The heuristic nature of this selection process introduced an element of subjectivity and required a deep understanding of the system dynamics to balance exploration of energetically unfavorable regions with efficient sampling of more likely states. Furthermore, the effectiveness of umbrella sampling depended on the assumption that the chosen weighting function and strength parameter could indeed provide a representative sampling across all relevant energy states. This could not be assumed in systems with highly complex energy landscapes or where critical configurations were not well understood (e.g., critical phenomena, amorphous materials, phase transitions in complex fluids). Consequently, the potential for sampling bias or missing significant energy states remained a pertinent consideration. The methodology’s reliance on numerical integration and statistical treatment of sampled data also introduced considerations related to numerical stability and convergence of results; ensuring that the sampling was sufficiently exhaustive to achieve statistical significance without being computationally prohibitive was a delicate balance, particularly in high-dimensional systems. Despite the inherent challenges in the selection of the weighting function \(w({{\varvec{q}}}^{N})\) and ensuring the numerical stability and convergence of results, umbrella sampling showed distinct advantages in exploring complex energy landscapes and proved particularly beneficial in studying systems near critical points of phase transitions, where traditional MC methods encountered sampling inefficiencies. Umbrella sampling’s ability to mitigate these inefficiencies by facilitating a more uniform exploration of the configuration space made it invaluable for investigating scenarios where the energy landscape featured major barriers.Footnote 52

6 Concluding remarks

The advent of molecular simulations defined an epoch in which classical statistical mechanics and nascent computational science converged in a symbiosis that reshaped scientific inquiry. Far from a linear march of advancement, this era was marked by an interplay between theoretical innovation and simulation-driven exploration, characterized by the systematic application of digital methods that challenged and redefined conventional scientific practices in condensed matter research and beyond. Through a historical lens, we observe a narrative characterized by methodological undulations—a spectrum of attempts ranging from rudimentary models (e.g., Metropolis et al. 1953; Alder and Wainwright 1957) to realistic representations of molecular dynamics (e.g., Rahman 1964) that attempted to provide enhanced control over theoretical constructs and experimental evidence. Through significant conceptual and methodological challenges, molecular simulations established themselves as essential tools for distilling complex phenomena into accurate, predictive models.

Just as the eyepiece expanded the observational capacity of our human senses and gave us the opportunity to see new worlds and build new theoretical views of nature, computers, thanks to their continuously improving performances, provided a processing power which allows scientists to “see” in ever greater depth what is implicitly contained in their equations, and complete the ambition of physics of explicitly predicting, and hence controlling, the behavior of systems that no one had been able to handle until then. Fundamental laws can at last be used to effectively simulate the real world in all its complexity; one is no longer confined to show that the behavior of the real material world is consistent with general laws, for it can also be demonstrated that this behavior can be foreseen on the basis of those very laws, whose implicit contents can now be fully developed (Battimelli et al. 2020, pp. 192–193).

Indeed, the ability of molecular simulations to reveal the details and dynamics of atomic and molecular structures signaled a shift toward an integrative methodology that harnessed computational power to explore the frontiers of scientific knowledge (see, e.g., Frenkel and Smit 2023, pp. 53–54). This transformation marked a departure from traditional experimentation and theoretical formulation while promoting a synthesis in which theoretical abstraction and observational evidence converged through computational means. It repositioned researchers from mere users of new digital tools to co-creators in a scientific evolution, enriching the predictive power of statistical mechanics. Before the widespread adoption of these fundamental simulations, computational tools were primarily used to compute the results of theoretically elaborated perturbative developments or to address mathematical challenges arising in applied phenomenological contexts (the most fundamental of which was typically continuum dynamics).Footnote 53 Molecular simulations, however, offered a novel platform where computational tools could not just solve, but also discover—enabling researchers to dynamically model complex systems, predict new phenomena, and thus generate hypotheses for further experimental or theoretical investigation. The advent of molecular simulations sparked a methodological renaissance that cultivated a cross-disciplinary field characterized by its proficiency in decoding complex phenomena through digital insights.

From early hard-sphere models (e.g., Alder and Wainwright 1957; Wood and Jacobson 1957) to refined representations of interparticle interactions via the LJ potential (e.g., Hansen and Verlet 1969), and from the genesis of the histogram and re-weighting techniques (Singer 1966; McDonald and Singer 1967a) to the introduction of importance sampling (Torrie and Valleau 1974), the contributions of this period provided a diverse foundation for the emergence of free-energy calculations. Konrad Singer’s group epitomized this spirit in the late 1960s, pioneering methods that transformed the study of phase equilibria and molecular energetics (e.g., McDonald and Singer 1967a, 1967b, 1969; Singer 1969). These methods were ambitious but were constrained by the computational limits of the time (see also Hoover and Ree 1967); the efforts of that era represent a significant struggle in the quest to navigate the vast configuration space of molecular systems (e.g., McDonald and Singer 1967a), especially near first-order phase transitions (e.g., Hansen and Verlet 1969). The work of Bennett and the collaborative efforts of Torrie and Valleau signaled a shift in free-energy calculations (Bennett 1976; Torrie and Valleau 1977), optimizing the use of computational resources and improving efficiency in processing time.

The move to a computer-based methodology heralded the integration of simulations as an indispensable tool for both testing and refining theoretical frameworks (Frenkel and Smit 2023, p. 2). This development facilitated a twofold process of analytical comparison. On the one hand, it enabled the juxtaposition of calculated properties derived from model systems with empirical data from experimental practice and allowed researchers to diagnose and correct inaccuracies within model systems. On the other, it permitted the evaluation of the congruence between the results of the simulations and the predictions of approximate analytical theories applied to identical model systems. Discrepancies observed here highlighted deficiencies in theoretical constructs and positioned computational simulations as critical “experiments” designed to interrogate, validate, or refute theoretical propositions.Footnote 54 This methodology has empowered researchers to rigorously test and refine theories prior to their application in real-world scenarios while redefining the reliability and applicability of scientific knowledge.

This application of computer simulation is of tremendous importance. It has led to the revision of some very respectable theories, some of them dating back to Boltzmann. And it has changed the way in which we construct new theories. Nowadays it is becoming increasingly rare that a theory is applied to the real world before being tested by computer simulation. The simulation then serves a twofold purpose: it gives the theoretician a feeling for the physics of the problem, and it generates some “exact” results that can be used to test the quality of the theory to be constructed. Computer experiments have become standard practice, to the extent that they now provide the first (and often the last) test of a new theoretical result (Frenkel and Smit 2023, p. 2).

As such, the historical trajectory of free-energy calculations through molecular simulations serves as a prime example of the symbiotic convergence of computational rigor and theoretical innovation and signaled a frontier in which scientific inquiry was viewed as a holistic enterprise. This confluence was not merely additive, but transformative and created a scientific milieu in which the onset of inquiry seamlessly blended theoretical conjecture with computational validation. The emergence of this integrated approach has changed the way theories are conceived, developed, and substantiated by ensuring that theoretical constructs undergo strict computational testing to determine their validity and applicability.

This discourse also illuminates the bidirectional co-evolutionary relationship between science and technology, wherein simulations emerge as central tools for overcoming experimental limitations, and each domain continually informs and refines the trajectory of the other. Indeed,

[i]t may be difficult or impossible to carry out experiments under extremes of temperature and pressure, while a computer simulation of the material in, say, a shock wave, a high-temperature plasma, a nuclear reactor, or a planetary core, would be perfectly feasible. Quite subtle details of molecular motion and structure, for example in heterogeneous catalysis, fast ion conduction, or enzyme action, are difficult to probe experimentally but can be extracted readily from a computer simulation (Allen and Tildesley 2017, pp. 4–5).

Within this iterative landscape, advances in computational technologies have both mirrored and catalyzed the emergence of novel scientific methodologies (Battimelli et al. 2020, pp. 2–3), while the intensifying demands of scientific inquiry have spurred technological breakthroughs in computing.Footnote 55 Such an analysis increasingly blurs the traditional divide between epistemological and technological domains and facilitates a deeper understanding of the co-construction of knowledge. Engaging with this epistemic tableau requires a reconceptualization of the mechanisms by which knowledge is generated, negotiated, and used, emphasizing the fluidity and interconnectedness inherent in the process of scientific discovery.Footnote 56 From this perspective, the development of free-energy calculations through molecular simulations exemplifies the entanglement of scientific inquiry and technological innovation; this field crystallizes the essence of their symmetrical relationship while broadening the discourse on their mutual reinforcement. By pushing the boundaries of both theoretical understanding and computational capability, the history of free-energy calculations showcases the potential residing at the nexus of epistemic inquiry and technological advancement. Such convergence invites an interdisciplinary cohort of historians, philosophers of science, and scholars of science and technology studies to delve into the expansive depths of this evolving field.