## Abstract

Nonequilibrium information thermodynamics determines the minimum energy dissipation required to reliably erase memory under time-symmetric control protocols. We demonstrate that its bounds are tight and show that the costs they imply overwhelm those set by Landauer’s energy bound on information erasure. Moreover, in the limit of perfect computation, the costs diverge. We conclude that time-asymmetric protocols should be developed for efficient, accurate thermodynamic computing, and that Landauer’s Stack—the full suite of theoretically-predicted thermodynamic costs—is ready for experimental test and calibration.

## 1 Introduction

In 1961, Landauer identified a fundamental energetic requirement for performing logically-irreversible computations on nonvolatile memory [1]. Focusing on arguably the simplest case—erasing a bit of information—he found that one must supply at least \(k_\text {B}T \ln 2\) of work energy (\(\approx 10^{-21}~\text{J}\) at room temperature), eventually expelling this as heat. (Here, \(k_\text {B}\) is Boltzmann’s constant and *T* is the temperature of the computation’s ambient environment.)
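For concreteness, Landauer's figure is easily checked with a one-line computation (a sketch assuming room temperature T = 300 K):

```python
import math

# Landauer's bound at room temperature: k_B * T * ln 2.
k_B = 1.380649e-23          # Boltzmann constant, J/K (exact, SI 2019)
T = 300.0                   # assumed ambient temperature, K

landauer_bound = k_B * T * math.log(2.0)
print(f"{landauer_bound:.2e} J")  # ~2.87e-21 J, i.e. the ~1e-21 J quoted above
```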

Notably, though still underappreciated, Landauer had identified a thermodynamically-reversible transformation. And so, no entropy need actually be produced—energy is not irrevocably dissipated—at least in the quasistatic, thermodynamically-reversible limit required to meet Landauer’s bound.

Landauer’s original argument appealed to equilibrium statistical mechanics. Since then, though, advances in nonequilibrium thermodynamics have shown that his bound on the required work follows from a modern version of the Second Law of thermodynamics [2]. (And, when the physical substrate’s dynamics are taken into account, this is the *information processing Second Law* (IPSL) [3].) These modern laws clarified many connections between information processing and thermodynamics, such as dissipation bounds due to system-state coarse-grainings [4], nanoscale information-heat engines [5], the relation of dissipation and fluctuating currents [6], and memory design [7].

Additional scalings recently emerged between computation time, space, reliability, thermodynamic efficiency, and robustness of information storage [8,9,10]. In contrast to Landauer’s bound, these tradeoffs involve thermodynamically-irreversible processes, implying that entropy production, and therefore true heat dissipation, is generally required, with the amount set by practical constraints and design goals.

In addition to these tradeoffs, it is now clear that substantial energetic costs are incurred when using logic gates and allied information-processing modules to construct a computer. This is especially so when compared to custom-designed hardware that optimally implements a particular computation [11].

Taken altogether these costs constitute a veritable *Landauer’s Stack* of the information-energy requirements for thermodynamic computing. Figure 1 illustrates Landauer’s Stack in the light of historical trends in the thermodynamic costs of performing elementary logic operations in CMOS technology. The units there are joules dissipated per logic operation. We take Landauer’s Stack to be the overhead including Landauer’s bound (\(k_\text {B}T \ln 2\) joules) up to the current (year 2020) energy dissipations *due to information processing*. Thus, the Stack is a hierarchy of energy expenditures that underlie contemporary digital computing—an arena of theoretically-predicted and as-yet unknown thermodynamic phenomena waiting detailed experimental exploration.

To account for spontaneous deviations that arise in small-scale systems, the Second Laws are now most properly expressed by exact equalities on probability distributions of possible energy fluctuations. These are the *fluctuation theorems* [24], from which the original Laws (in fact, inequalities) can be readily recovered. Augmenting the Stack, fluctuation theorems apply directly to information processing, elucidating further thermodynamic restrictions and structure [25,26,27,28].

The result is a rather more complete accounting of the energetic costs of thermodynamic computation, captured in the refined Landauer’s Stack of Fig. 1. In this spirit, here we report new bounds on the work required to compute in the very important case of computations driven externally by time-symmetric control protocols [12]. In surprising contrast to the finite energy cost of erasure identified by Landauer, we demonstrate that the minimum required energy *diverges as a function of accuracy* and so can dominate Landauer’s Stack. Our main goal in the following is to validate Ref. [12]’s thermodynamic bounds and demonstrate their tightness, doing so in Landauer’s original setting of information erasure.

In essence, our argument is as follows. Energy dissipation in thermodynamic transformations is closely tied to entropy production. The fluctuation theorems establish that entropy production depends on both the forward and reverse dynamics. Thus, when determining bounds on dissipation in thermodynamic computing, one must examine the control protocol applied both forward and in reverse. By considering time-symmetric protocols we substantially augment Landauer and Bennett’s dissipation bound on logical irreversibility [29] with dissipation due to logical nonselfinvertibility (aka nonreciprocity). Our results therefore complement recent work on the consequences of logical and thermodynamic reversibility [30]. Parallel work on thermodynamic bounds for information processing in finite time, and bit erasure in particular, includes the use of optimized control in the linear-response regime [31,32,33] and transport theory [34,35,36]. However, the cost of nonreciprocity necessarily goes beyond the cost of finite-time computing, because time-symmetrically driven computations incur this additional dissipation regardless of the rate at which they are executed.

Why time-symmetric protocols? Modern digital computers are driven by sinusoidal line voltages and square-wave clock pulses. These control signals function as control parameters, directly altering the energetics and therefore guiding dynamics of the computer components. Being time-symmetric control signals, modern digital computers must then obey Ref. [12]’s error-dissipation trade-off. Moreover, the costs apply to even the most basic of computational tasks—such as bit erasure. Here, we present protocols for time-symmetrically implementing erasure in two different frameworks and demonstrate that both satisfy the new bounds. Moreover, many protocols approach the bounds quite closely, indicating that they may in fact be broadly achievable.

After a brief review of the general theory, we begin with an analysis of erasure implemented with the simple framework of two-state rate equations, demonstrating the validity of the bound for different protocols of increasing reliability. We then expand our framework to fully simulated collections of particles erased in an underdamped Langevin double-well potential, seeing the same faithfulness to the bound for a wide variety of different erasure protocols. We conclude with a call for follow-on efforts to analyze even more efficient computing that can arise from *time-asymmetric* protocols.

## 2 Dissipation in Thermodynamic Computing

Consider a universe consisting of a computing device—the *system under study* (SUS), a *thermal environment* at fixed inverse temperature \(\beta = 1 / k_\text {B} T\), and a *laboratory device* (lab) that includes a *work reservoir*. The set of possible microstates for the SUS is denoted \(\varvec{{\mathcal {S}}}\), with \(s\) denoting an individual SUS microstate. The SUS is driven by a *control parameter* \(x\) generated by the lab. The SUS is also in contact with the thermal environment.

The overall evolution occurs from time \(t = 0\) to \(t = \tau \) and is determined by two components. The first is the SUS’s Hamiltonian \({\mathcal {H}}_{SL}(s, x)\) that specifies its interaction with the lab device and determines (part of) the rates of change of the SUS coordinates consistent with Hamiltonian mechanics. We refer to the possible values of the Hamiltonian as the *SUS energies*. The second component is the thermal environment which exerts a stochastic influence on the system dynamics.

We design the lab to guarantee that a specific control parameter value *x*(*t*) is applied to the SUS at every time *t* over the time interval \(t \in (0, \tau )\). That is, the control parameter evolves deterministically as a function of time. The deterministic trajectory taken by the control parameter *x*(*t*) over the computation interval is the *control protocol*, denoted by \(\overrightarrow{x}\). The SUS microstate *s*(*t*) exhibits a response to the control protocol over the interval, following a stochastic trajectory denoted \(\overrightarrow{s}\).

For a given microstate trajectory \(\overrightarrow{s}\), the net energy transferred from the lab to the SUS is defined as the *work*, which has the following form [5]:

\[
W(\overrightarrow{s}) = \int _0^\tau dt \; \dot{x}(t) \, \partial _x {\mathcal {H}}_{SL}\bigl ( s(t), x(t) \bigr ) ~.
\]

This is the energy accumulated in the SUS directly caused by changes in the control parameter.

Given an initial microstate \(s_0\), the probability of a microstate trajectory \(\overrightarrow{s}\) conditioned on starting in \(s_0\) is denoted \(P(\overrightarrow{s} \mid s_0)\). With the SUS initialized in microstate distribution \(\varvec{\mu }_{0}\), the unconditioned *forward process* gives the probability of trajectory \(\overrightarrow{s}\):

\[
P(\overrightarrow{s}) = \mu _0(s_0) \, P(\overrightarrow{s} \mid s_0) ~.
\]

*Detailed fluctuation theorems* (DFTs) [37, 38] determine thermodynamic properties of the computation by comparing the forward process to the *reverse process*. This requires determining the conditional probability \(R(\overrightarrow{s} \mid s_0)\) of trajectories generated under time-reversed control.

The reverse control protocol is \(\overrightarrow{x}^\dagger _\text {rev}\), with \(x^\dagger _\text {rev}(t) = x^\dagger (\tau - t)\), where \(x^\dagger \) is *x*, but with all time-odd components (e.g., magnetic field) flipped in sign. And, the *reverse process* results from the application of this dynamic to the final distribution \(\varvec{\mu }_{\tau }\) of the forward process with microstates conjugated:

\[
R(\overrightarrow{s}) = \mu _\tau (s_0^\dagger ) \, R(\overrightarrow{s} \mid s_0) ~.
\]

The Crooks DFT [38] then gives an equality on both the dissipated work (or entropy production) that is produced as well as the required work for a given trajectory induced by the protocol:

\[
\beta W_\text {diss}(\overrightarrow{s}) = \beta \bigl [ W(\overrightarrow{s}) - \Delta {\mathcal {F}} \bigr ] = \ln \frac{P(\overrightarrow{s})}{R(\overrightarrow{s}^\dagger )} ~. \tag{1}
\]

\(\overrightarrow{s}^\dagger \) here is itself a SUS microstate trajectory, with \(s^\dagger (t) = \bigl ( s(\tau - t) \bigr )^\dagger \).

Due to their practical relevance, we consider protocols that are symmetric under time reversal: \(x(t) = x^\dagger (\tau - t)\). That is, the reverse-process probability of trajectory \(\overrightarrow{s}\) conditioned on starting in microstate \(s_0\) is the same as that of the forward process:

\[
R(\overrightarrow{s} \mid s_0) = P(\overrightarrow{s} \mid s_0) ~.
\]

However, the unconditional reverse-process probability of the trajectory \(\overrightarrow{s}\) is then:

\[
R(\overrightarrow{s}) = \mu _\tau (s_0^\dagger ) \, P(\overrightarrow{s} \mid s_0) ~.
\]

This leads to a version of Crooks’ DFT that can be used to set modified bounds on a computation’s dissipation:

\[
\beta W_\text {diss}(\overrightarrow{s}) = \ln \frac{\mu _0(s_0) \, P(\overrightarrow{s} \mid s_0)}{\mu _\tau (s_\tau ) \, P(\overrightarrow{s}^\dagger \mid s_\tau ^\dagger )} ~.
\]

Suppose, now, that the final and initial SUS Hamiltonian configurations
\({\mathcal {H}}_{SL}(s,x(\tau ))\) and
\({\mathcal {H}}_{SL}(s,x(0))\) are both designed to store the same information about the SUS. The SUS microstates are partitioned into locally-stable regions that are separated by large energy barriers in these energy landscapes. On some time scale, a state initialized in one of these regions has a very low probability of escape and instead locally equilibrates to its locally-stable region. These regions can thus be used to store information for periods of time controlled by the energy barrier heights. Collectively, we refer to these regions as *memory states*
\(\varvec{{\mathcal {M}}}\).

Then the probability of the system evolving to a memory state \(m'\in \varvec{{\mathcal {M}}}\) given that it starts in a memory state \(m\in \varvec{{\mathcal {M}}}\) under either the forward or reverse process is:

\[
P'({m} \rightarrow {m'}) = \frac{\sum _{\overrightarrow{s}} P(\overrightarrow{s}) \, \bigl [ \!\! \bigl [ s_0 \in m \bigr ] \!\! \bigr ] \, \bigl [ \!\! \bigl [ s_\tau \in m' \bigr ] \!\! \bigr ]}{\sum _{\overrightarrow{s}} P(\overrightarrow{s}) \, \bigl [ \!\! \bigl [ s_0 \in m \bigr ] \!\! \bigr ]} ~,
\]

where
\(\bigl [ \!\! \bigl [E \bigr ] \!\! \bigr ]\) evaluates to one if expression *E* is true and zero otherwise.

To simplify the development, suppose that the energy landscape of each memory state looks the same locally. That is, up to translation and possibly reflection and rotation, each memory state spans the same volume in microstate space and has the same energies at each of those states. Further, suppose that the SUS starts and ends in a metastable distribution, differing from global equilibrium only in the weight that each memory state is given in the distribution. Otherwise, the distributions look identical to the global equilibrium distribution at the local scale of any memory state. This ensures that the average change in SUS energy is zero, simplifying the change \(\Delta \) in nonequilibrium free energy \({\mathcal {F}}\) [12]:

\[
\Delta {\mathcal {F}} = -k_\text {B}T \, \Delta H({\mathcal {M}}_t) = -k_\text {B}T \bigl [ H({\mathcal {M}}_\tau ) - H({\mathcal {M}}_0) \bigr ] ~,
\]

where
\(H( \cdot )\) is the Shannon entropy (in nats), and
\({\mathcal {M}}_t\) is the random variable for the memory state at time *t*. Finally, suppose that the time reversal of a microstate changes neither the memory state it exists in nor its equilibrium probability, for any time during the protocol. This holds for memory states distinguished primarily by the positions of the system particles and system Hamiltonians that are unchanging under time reversal. See Ref. [12] for details behind these assumptions and generalized bounds without them.

Then we have the following inequality:

\[
\beta \bigl \langle W_\text {diss}(\overrightarrow{s}) \bigr \rangle \ge \sum _{m, m' \in \varvec{{\mathcal {M}}}} \Pr ({\mathcal {M}}_0 = m) \, d(m, m') ~, \tag{2}
\]

where:

\[
d(m, m') = P'({m} \rightarrow {m'}) \ln \frac{P'({m} \rightarrow {m'})}{P'({m'} \rightarrow {m})} ~.
\]

See Appendix A for a proof sketch.

Recalling that \(\beta \langle W_\text {diss}(\overrightarrow{s})\rangle = \beta ( \langle W (\overrightarrow{s})\rangle - \Delta {\mathcal {F}} )\) and appealing to the inequality in Eq. (2), we find a simple bound on the average work over the protocol:

\[
\langle W (\overrightarrow{s}) \rangle \ge k_\text {B}T \Bigl [ -\Delta H({\mathcal {M}}_t) + \sum _{m, m' \in \varvec{{\mathcal {M}}}} \Pr ({\mathcal {M}}_0 = m) \, d(m, m') \Bigr ] ~. \tag{3}
\]

This provides a bound on the work that depends solely on the logical operation of the computation, but goes beyond Landauer’s bound.

Since we are addressing modern computing, we consider processes that approximate deterministic computations on the memory states. For such computations there exists a computation function \(C: \varvec{{\mathcal {M}}}\rightarrow \varvec{{\mathcal {M}}}\) such that the physically-implemented stochastic map approximates the desired function up to some small error. That is, \(P'({m} \rightarrow {C(m)}) = 1 - \epsilon _m\) where \(0 < \epsilon _m\ll 1\). In fact, we require all relevant errors to be bounded by a small error threshold \(\epsilon \ll 1\). That is, for all \(m' \ne C(m)\) let \(P'({m} \rightarrow {m'}) = \epsilon _{{m}\rightarrow {m'}}\) so that \(0 \le \sum _{m'\ne C(m)} \epsilon _{{m}\rightarrow {m'}} = \epsilon _m\le \epsilon \ll 1\).

We can then simplify Eq. (3)’s bound in the limit of small \(\epsilon \). First, we show that \(d(m, m') \ge 0\) for any pair of \(m, m'\) in the small \(\epsilon \) limit, where we have:

\[
d(m, m') = P'({m} \rightarrow {m'}) \ln \frac{P'({m} \rightarrow {m'})}{P'({m'} \rightarrow {m})} ~.
\]

If \(C(m) = m'\), then \(P'({m} \rightarrow {m'}) = 1 - \epsilon _m\ge 1 - \epsilon \), so that:

\[
d(m, m') \ge (1 - \epsilon ) \ln (1 - \epsilon ) ~,
\]

which vanishes as \(\epsilon \rightarrow 0\). And, if \(C(m) \ne m'\), then \(P'({m} \rightarrow {m'}) = \epsilon _{{m}\rightarrow {m'}}\), so that:

\[
d(m, m') \ge \epsilon _{{m}\rightarrow {m'}} \ln \epsilon _{{m}\rightarrow {m'}} \ge \epsilon \ln \epsilon ~,
\]

which also vanishes as \(\epsilon \rightarrow 0\). Setting this asymptotic lower bound on the dissipation of each transition facilitates isolating divergent contributions, such as those we now consider.

An *unreciprocated* memory transition \(C(m) = m'\) is one that does not map back to itself: \(C(m') \ne m\). The contribution to the dissipation bound is:

\[
d(m, m') = (1 - \epsilon _m) \ln \frac{1 - \epsilon _m}{\epsilon _{{m'}\rightarrow {m}}} \ge (1 - \epsilon ) \ln \frac{1 - \epsilon }{\epsilon } ~.
\]

As \(\epsilon \rightarrow 0\), this gives:

\[
d(m, m') \gtrsim \ln \epsilon ^{-1} ~.
\]

That is, as computational accuracy increases (\(\epsilon \rightarrow 0\)), \(d(m, m')\) diverges. This means the minimum-required work (Eq. (3)) must then also diverge.
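The divergence can be checked numerically; the short sketch below assumes the weighted log-ratio form of \(d(m, m')\) used above and reports values in units of \(k_\text {B}T\):

```python
import math

def d(p_fwd, p_rev):
    """Per-transition dissipation contribution (units of k_B T), assuming
    the weighted log-ratio form: p_fwd = P'(m -> m'), p_rev = P'(m' -> m)."""
    return 0.0 if p_fwd == 0.0 else p_fwd * math.log(p_fwd / p_rev)

# Unreciprocated transition C(m) = m' with C(m') != m: the forward move
# succeeds with probability 1 - eps, while the reverse move is an error.
for eps in (1e-2, 1e-4, 1e-8):
    print(eps, d(1.0 - eps, eps))  # grows like ln(1/eps)
```

Reciprocated transitions, by contrast, give `d(1 - eps, 1 - eps)`, which vanishes rather than diverging.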

We then arrive at our simplified bound for the small-\(\epsilon \) high-accuracy limit from Eq. (3)’s inequality on dissipation by only including the contribution from unreciprocated transitions \(m' = C(m)\) for which \(m \ne C(m')\):

\[
\beta \bigl \langle W_\text {diss}(\overrightarrow{s}) \bigr \rangle \gtrsim \ln (\epsilon ^{-1}) \, \bigl \langle \bigl [ \!\! \bigl [ C(C({\mathcal {M}}_0)) \ne {\mathcal {M}}_0 \bigr ] \!\! \bigr ] \bigr \rangle _{{\mathcal {M}}_0} ~. \tag{5}
\]

In this way, we see how computational accuracy drives a thermodynamic cost that diverges, overwhelming the Landauer-erasure cost. A similar logarithmic relationship between dissipated work and error was demonstrated in the context of the adaptation accuracy of *Escherichia coli* and other simple biological systems [39].

The bound in Eq. (5) also applies to digital computing such as that performed with dynamic random-access memory (DRAM). We recognize that its operation places the device in a nonequilibrium steady state, appearing to negate the applicability of Crooks’ fluctuation theorem in Eq. (1). However, the remedy for systems whose steady states are nonequilibrium is simply to replace the equality with an inequality, implying that more work must be dissipated than in the case of a local-equilibrium steady state [40]. Thus, our derived bounds must still hold for these modern computing devices.

## 3 Erasure Thermodynamics

Inequalities Eqs. (3) and (5) place severe constraints on the work required to process information via time-symmetric control on memories. The question remains, though, whether or not these bounds can actually be met by specific protocols or if there might be still tighter bounds to be discovered.

To help answer this question, we turn to the case, originally highlighted by Landauer [1], of erasing a single bit of information. This remarkably simple case of computing has held disproportionate sway in the development of thermodynamic computing compared to other elementary operations. The following does not deviate from this habit, showing, in fact, that there remain fundamental issues. We explore this via two different implementations: The first described via two-state rate equations and the second with an underdamped double-well potential—Landauer’s original, preferred setting.

Suppose the SUS supports two (mesoscopic) memory states, labeled \(\text {L}\) and \(\text {R}\). The task of a time-symmetric protocol that implements erasure is to guide the SUS microscopic dynamics that starts with an initial \(50-50\) distribution over the two memory states to a final distribution as biased as possible onto the \(\text {L}\) state. The logical function \(C\) of perfect bit erasure is attained when \(C(\text {L}) = C(\text {R}) = \text {L}\), setting either memory state to \(\text {L}\). The probabilities of incorrectly sending an \(\text {L}\) state to \(\text {R}\) and an \(\text {R}\) state to \(\text {R}\) are denoted \(\epsilon _\text {L}\) and \(\epsilon _\text {R}\), respectively.

Error generation is described by the binary asymmetric channel [41]—the *erasure channel* \({\mathcal {E}}\) with conditional probabilities:

\[
{\mathcal {E}} = \begin{pmatrix} P'({\text {L}} \rightarrow {\text {L}}) & P'({\text {L}} \rightarrow {\text {R}}) \\ P'({\text {R}} \rightarrow {\text {L}}) & P'({\text {R}} \rightarrow {\text {R}}) \end{pmatrix} = \begin{pmatrix} 1 - \epsilon _\text {L} & \epsilon _\text {L} \\ 1 - \epsilon _\text {R} & \epsilon _\text {R} \end{pmatrix} ~.
\]

For any erasure implementation, this Markov transition matrix gives the error rate \(\epsilon _\text {L}=\epsilon _{\text {L}\rightarrow \text {R}}\) from initial memory state \({\mathcal {M}}_0=\text {L}\) and the error rate \(\epsilon _\text {R}=\epsilon _{\text {R}\rightarrow \text {R}}\) from the initial memory state \({\mathcal {M}}_0=\text {R}\).

Noting first that \(d(m, m) = 0\) generically, we then have:

\[
d(\text {L}, \text {R}) = \epsilon _\text {L} \ln \frac{\epsilon _\text {L}}{1 - \epsilon _\text {R}} \quad \text {and} \quad d(\text {R}, \text {L}) = (1 - \epsilon _\text {R}) \ln \frac{1 - \epsilon _\text {R}}{\epsilon _\text {L}} ~.
\]

So, the bound of Eq. (3) simplifies to:

\[
\langle W \rangle \ge \langle W \rangle _\text {min}^{t\text {-sym}} = k_\text {B}T \Bigl [ \ln 2 + \langle \epsilon \rangle \ln \langle \epsilon \rangle + (1 - \langle \epsilon \rangle ) \ln (1 - \langle \epsilon \rangle ) + \tfrac{1}{2} \epsilon _\text {L} \ln \frac{\epsilon _\text {L}}{1 - \epsilon _\text {R}} + \tfrac{1}{2} (1 - \epsilon _\text {R}) \ln \frac{1 - \epsilon _\text {R}}{\epsilon _\text {L}} \Bigr ] ~, \tag{6}
\]

where \(\langle \epsilon \rangle = (\epsilon _\text {L}+ \epsilon _\text {R})/2\) is the average error for the process.

Notice further that \(C(C(\text {L})) = \text {L}\) but \(C(C(\text {R})) \ne \text {R}\), indicating that only the computation on \(\text {R}\) is nonreciprocal. Therefore, the bound of Eq. (5) simplifies to:

\[
\langle W \rangle _\text {min}^\text {approx} = \frac{k_\text {B}T}{2} \ln \epsilon ^{-1} ~. \tag{7}
\]

Applying Eq. (7) to DRAM directly provides a quantitative comparison beyond a formal divergence of energy costs. Contemporary DRAM exhibits “soft” error rates around \(10^{-22}\) failures per write operation [42]. In fact, each write operation is effectively an erasure. (The quoted statistic is an average of 4,000 correctable errors per 128 MB DIMM per year.) Using Eq. (7), this gives a thermodynamic cost of \(25~k_\text {B}T\), which is markedly larger than Landauer’s \(k_\text {B}T\ln 2\). It is also, just as clearly, smaller by a factor of roughly 10 than the contemporary energy costs per logic operation displayed in Fig. 1. These numerical results support the conclusion that modern computers can still be improved in efficiency, though that efficiency is ultimately limited by the bounds introduced here. The conclusion is further reinforced by the numerical simulations in the following sections, which nearly achieve our theoretical bounds.
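This back-of-envelope comparison is easy to reproduce; the sketch below assumes homogeneous errors \(\epsilon _\text {L} = \epsilon _\text {R} = \epsilon \) and the bound forms discussed above, with all values in units of \(k_\text {B}T\):

```python
import math

def landauer_bound(eps):
    # ln 2 minus the final memory-state entropy (nats), units of k_B T
    h_final = -eps * math.log(eps) - (1.0 - eps) * math.log1p(-eps)
    return math.log(2.0) - h_final

def tsym_bound(eps):
    # exact time-symmetric bound, assuming homogeneous errors
    return landauer_bound(eps) + (0.5 - eps) * math.log((1.0 - eps) / eps)

def approx_bound(eps):
    # small-eps asymptotic, Eq. (7): (1/2) ln(1/eps)
    return 0.5 * math.log(1.0 / eps)

eps = 1e-22  # DRAM soft-error rate per write quoted above
print(landauer_bound(eps))  # ~0.693 k_B T: Landauer's ln 2
print(approx_bound(eps))    # ~25.3 k_B T: dominant time-symmetric cost
```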

### 3.1 Erasure with Two-state Rate Equations

A direct test of time-symmetric erasure requires only a simple two-state system that evolves under a rate equation:

\[
\frac{d}{dt} \begin{pmatrix} p_\text {L}(t) \\ p_\text {R}(t) \end{pmatrix} = \begin{pmatrix} -k_{\text {L}\rightarrow \text {R}}(t) & k_{\text {R}\rightarrow \text {L}}(t) \\ k_{\text {L}\rightarrow \text {R}}(t) & -k_{\text {R}\rightarrow \text {L}}(t) \end{pmatrix} \begin{pmatrix} p_\text {L}(t) \\ p_\text {R}(t) \end{pmatrix} ~,
\]

obeying the Arrhenius equations:

\[
k_{\text {R}\rightarrow \text {L}}(t) = A \, e^{-\beta \Delta E_\text {R}(t)} \quad \text {and} \quad k_{\text {L}\rightarrow \text {R}}(t) = A \, e^{-\beta \Delta E_\text {L}(t)} ~,
\]

where the states are labeled \(\{\text {L}, \text {R}\}\) and the terms \(\Delta E_\text {R}(t)\) and \(\Delta E_\text {L}(t)\) in the exponentials are the activation energies to transit over the energy barrier at time *t* for the Right and Left wells, respectively.

These dynamics can be interpreted as a coarse-graining of thermal motion in a double-well potential energy landscape *V*(*q*, *t*) over the positional variable *q* at time *t*. Above, *A* is an arbitrary constant, which is fixed for the dynamics. \(q^*_\text {R}\) and \(q^*_\text {L}\) are the locations of the Right and Left potential well minima, respectively. Thus, assuming that \(q=0\) is the location of the barrier’s maximum between them, we see that the activation energies can be expressed as \(\Delta E_\text {R}(t) =V(0,t)-V(q^*_\text {R},t)\) and \(\Delta E_\text {L}(t) =V(0,t)-V(q^*_\text {L},t)\). By varying the potential energy extrema \(V(q^*_\text {R},t)\), \(V(q^*_\text {L},t)\), and *V*(0, *t*) we control the dynamics of the observed variables \(\{ \text {L}, \text {R}\}\) in much the same way as is done with physical implementations of erasure where barrier height and tilt are controlled in a double-well [43].

Deviating from previous investigations of efficient erasure, where Landauer’s bound was nearly achieved over long times [43, 44], here the constraint to time-symmetric driving over the interval \(t \in (0, \tau )\) results in additional dissipated work. As Landauer described [1], erasure can be implemented by turning on and off a tilt from \(\text {R}\) to \(\text {L}\)—a time-symmetric protocol. However, to achieve higher accuracy, we also lower the barrier while the system is tilted energetically towards the \(\text {L}\) well.

Consider a family of control protocols that fit the profile shown in Fig. 2. First, we increase the energy tilt from \(\text {R}\) to \(\text {L}\) via the energy difference \(V(q^*_\text {R},t)-V(q^*_\text {L},t)\) measured in units of \(k_\text {B}T\). This increases the relative probability of transitioning \(\text {R}\) to \(\text {L}\). However, with the energy barrier at its maximum height, the transition takes quite some time. Thus, we reduce the energy barrier *V*(0, *t*) to its minimum height halfway through the protocol \(t= \tau /2\). Then, we reverse the protocol, raising the barrier back to its default height to hold the probability distribution fixed in the well and untilt so that the system resets to its default double-well potential.
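The protocol family above can be sketched numerically; the following is a minimal illustration (with illustrative parameter values, not those used in our figures) that Euler-integrates the two-state rate equation and accumulates the work:

```python
import math

# Illustrative time-symmetric two-state erasure: tilt R upward while
# partially lowering the barrier, then reverse the protocol symmetrically.
A, beta, tau, dt = 1.0, 1.0, 100.0, 0.001

def protocol(t):
    """Well and barrier energies (k_B T units) at time t."""
    s = 1.0 - abs(2.0 * t / tau - 1.0)  # 0 -> 1 -> 0: time-symmetric ramp
    V_L = 0.0                 # left well held fixed
    V_R = 6.0 * s             # tilt: right well raised by up to 6 k_B T
    V_b = 10.0 - 2.0 * s      # barrier lowered from 10 to 8 k_B T
    return V_L, V_R, V_b

def evolve(p_R):
    """Euler-integrate the master equation; return (final p_R, work)."""
    work, V_R_prev = 0.0, protocol(0.0)[1]
    t = 0.0
    while t < tau:
        V_L, V_R, V_b = protocol(t)
        k_LR = A * math.exp(-beta * (V_b - V_L))  # Arrhenius escape from L
        k_RL = A * math.exp(-beta * (V_b - V_R))  # Arrhenius escape from R
        work += p_R * (V_R - V_R_prev)            # work input: sum of p_R dV_R
        V_R_prev = V_R
        p_R += dt * (k_LR * (1.0 - p_R) - k_RL * p_R)
        t += dt
    return p_R, work

eps_R, _ = evolve(p_R=1.0)    # start in R: residual R-probability is the error
_, avg_work = evolve(p_R=0.5) # 50-50 start: average work for the ensemble
print(eps_R, avg_work)
```

Increasing the maximum tilt in `protocol` reduces `eps_R` while raising `avg_work`, the tradeoff explored next.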

Increasing the maximum tilt—given by \(V(q^*_\text {R},\tau /2)-V(q^*_\text {L},\tau /2)\) at the halfway time—increases erasure accuracy. Figure 3 shows that the maximum error \(\epsilon = \max \{ \epsilon _\text {R},\epsilon _\text {L}\}\) decreases nearly exponentially with increased maximum energy difference between left and right, going below 1 error in every 1000 trials for our parameter range. Note that \(\epsilon \) starts at a very high value (greater than 1/2) for zero tilt, since the probability \(\epsilon _\text {R}=\epsilon \) of ending in the \(\text {R}\) well starting in the \(\text {R}\) well is very high if there is no tilt to push the system out of the \(\text {R}\) well.

Figure 3 also shows the relationship between the work and the bounds described above. Given that our system consists of two states \(\{\text {L}, \text {R}\}\) and that we choose a control protocol that keeps the energy \(V(q^*_\text {L},t)\) on the left fixed, the work (marked by green \(+\)s in the figure) is [5]:

\[
\langle W \rangle = \int _0^\tau dt \; p_\text {R}(t) \, \frac{\partial V(q^*_\text {R}, t)}{\partial t} ~.
\]

This work increases almost linearly as the error decreases exponentially.

As a first comparison, note that the Landauer bound \(\langle W \rangle ^\text {Landauer}_\text {min}=- k_\text {B}T\Delta H({\mathcal {M}}_t)\) (marked by orange \(\times \)s in the figure) is still valid. However, it is a very weak bound for this time-symmetric protocol. The Landauer bound saturates at \(k_\text {B}T \ln 2\). Thus, the dissipated work—the gap between orange \(\times \)s and green \(+\)s—grows approximately linearly with increasing tilt energy.

In contrast, Eq. (6)’s bound \(\langle W \rangle _\text {min}^{t\text {-sym}}\) for time-symmetric protocols is much tighter. The time-symmetric bound is satisfied: the blue circles that mark it all fall below the calculated work (green \(+\)s). Not only is this bound much stricter, it also almost exactly matches the calculated work over a large range of parameters, with the work departing from the bound only at higher tilts and lower error rates.

Finally, the approximate bound \(\langle W \rangle _\text {min}^\text {approx} = \frac{ k_\text {B}T}{2}\ln \epsilon ^{-1}\) (marked by red \(+\)s) of Eq. (7), which captures the error scaling, behaves as expected. The error-dependent work bound nearly exactly matches the exact bound for low error rates on the right side of the plot and effectively bounds the work. For lower tilts, this quantity does not bound the work and is not a good estimate of the true bound, but this is consistent with expectations for high error rates. This approximation should only be employed for very reliable computations, for which it appears to be an excellent estimate. Thus, the two-level model of erasure demonstrates that the time-symmetric control bounds on work and dissipation are reasonable in both their exact and approximate forms at low error rates.

### 3.2 Erasure with an Underdamped Double-well Potential

The physics in the rate equations above represents a simple model of a bistable thermodynamic system, which can serve as an approximation for many different bistable systems. One possible interpretation is a coarse-graining of the Langevin dynamics of a particle moving in a double-well potential. To explore the broader validity of the error–dissipation tradeoff, here we simulate the dynamics of a stochastic particle coupled to a thermal environment at constant temperature and a work reservoir via such a 1D potential. Again, we find that the time-symmetric bounds are much tighter than Landauer’s, reflecting the error–dissipation tradeoff of this control protocol class.

Consider a one-dimensional particle with position and momentum in an external potential and in thermal contact with the environment at temperature *T*. We consider a protocol architecture similar to that of Sect. 3.1, but with additional passive substages at the beginning, middle, and end: (i) hold the potential in the symmetric double-well form, (ii) positively tilt the potential, (iii) completely drop the potential barrier between the two wells, (iv) hold the potential while it is tilted with no barrier, (v) restore the original barrier, (vi) remove the positive tilt, restoring the original symmetric double-well, and (vii) hold the potential in this original form.

As a function of position *q* and time *t*, the potential then takes the form:

\[
V(q, t) = a q^4 - b_0 \, b_f(t) \, q^2 + c_0 \, c_f(t) \, q ~,
\]

with constants \(a, b_0, c_0 > 0\). The protocol functions \(b_f(t)\) and \(c_f(t)\) evolve in a piecewise linear and time-symmetric manner according to Table 1, where \(t_0, t_1, \ldots , t_7 = 0, \tau /12, 3\tau /12, 5\tau /12, 7\tau /12, 9\tau /12, 11\tau /12, \tau \). The potential thus begins and ends in a symmetric double-well configuration with each well defining a memory state. During the protocol, though, the number of metastable regions is temporarily reduced to one. Figure 4 (top three panels) shows the protocol functions over time as well as the resultant potential function at key times for one such set of protocol parameters; see nondimensionalization in Appendix II. At any time, we label the metastable regions from most negative position to most positive the \(\text {L}\) state and, if it exists, the \(\text {R}\) state.
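The seven substages fix the shape of the protocol functions; the sketch below assumes an illustrative unit normalization of the waypoint values (Table 1 holds the actual values) and builds the piecewise-linear, time-symmetric \(b_f\) and \(c_f\):

```python
def piecewise_linear(time, knots):
    """Linearly interpolate (time, value) knots; clamp outside the range."""
    for (t0, v0), (t1, v1) in zip(knots, knots[1:]):
        if t0 <= time <= t1:
            return v0 + (v1 - v0) * (time - t0) / (t1 - t0)
    return knots[0][1] if time < knots[0][0] else knots[-1][1]

tau = 1.0
t = [0, tau/12, 3*tau/12, 5*tau/12, 7*tau/12, 9*tau/12, 11*tau/12, tau]

# barrier scale: full until stage (iii) drops it, zero in (iv), restored in (v)
b_f_knots = [(t[0], 1), (t[2], 1), (t[3], 0), (t[4], 0), (t[5], 1), (t[7], 1)]
# tilt scale: off in stage (i), ramped on in (ii), held, ramped off in (vi)
c_f_knots = [(t[0], 0), (t[1], 0), (t[2], 1), (t[5], 1), (t[6], 0), (t[7], 0)]

b_f = lambda time: piecewise_linear(time, b_f_knots)
c_f = lambda time: piecewise_linear(time, c_f_knots)
```

Both functions satisfy \(f(t) = f(\tau - t)\), as time symmetry requires.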

We simulate the motion of the particle with underdamped Langevin dynamics:

\[
m \ddot{q} = -\partial _q V(q, t) - \lambda \dot{q} + \sqrt{2 k_\text {B}T \lambda } \, r(t) ~,
\]

where \(\lambda \) is the coupling between the thermal environment and particle, *m* is the particle’s mass, and *r*(*t*) is a memoryless Gaussian random variable with \(\langle r(t) \rangle = 0\) and \(\langle r(t) r(t') \rangle = \delta (t-t')\). The particle is initialized to be in global equilibrium over the initial potential \(V(\cdot , 0)\). Figure 4 (bottom panel) shows 100 randomly-chosen resultant trajectories for a choice of process parameters.
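A bare-bones Euler–Maruyama integration of these dynamics (with illustrative constants rather than the paper's nondimensionalized parameters, and with the potential frozen in its initial symmetric form) illustrates the metastability of the memory states:

```python
import math, random

# Underdamped Langevin particle in a static symmetric double well
# (illustrative constants); the full protocol would additionally
# modulate the barrier and tilt over time.
m, lam, kT = 1.0, 1.0, 0.25
dt, tau = 0.002, 10.0

def dVdq(q):
    # V(q) = q^4 - 2 q^2: minima at q = -1 (L) and q = +1 (R), barrier 1
    return 4.0 * q**3 - 4.0 * q

def trajectory(q, p):
    """Euler-Maruyama integration; returns the final position."""
    t = 0.0
    while t < tau:
        q += (p / m) * dt
        p += (-dVdq(q) - lam * p / m) * dt \
             + math.sqrt(2.0 * lam * kT * dt) * random.gauss(0.0, 1.0)
        t += dt
    return q

random.seed(0)
finals = [trajectory(q=1.0, p=0.0) for _ in range(200)]
frac = sum(f > 0.0 for f in finals) / len(finals)
print(frac)  # fraction of trajectories remaining in the R well
```

With the barrier several \(k_\text {B}T\) high, most trajectories stay in their initial well over this time scale, which is precisely what makes the wells usable as memory states.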

The work done on a single particle over the course of the protocol with trajectory \(\bigl ( q(t) \bigr )_t\) is [5]:

\[
W = \int _0^\tau dt \; \partial _t V(q, t) \big |_{q = q(t)} ~.
\]

Figure 5 shows the net average work over time for an erasure process, comparing it to (i) the Landauer bound, (ii) the exact bound of Eq. (6), and (iii) the approximate bound of Eq. (7). Notice that the final net average work lies above all three, as it should, and that the time-symmetric bounds presented here are tighter than Landauer’s.

We repeat this comparison for an array of different parameters for the erasure protocol. As described in Appendix II, we vary features of the dynamics—including mass *m*, temperature *T*, coupling to the heat bath \(\lambda \), duration of control \(\tau \), maximum depth of the potential energy wells, and maximum tilt between the wells. Nondimensionalization reduces the relevant parameters to just four, allowing us to explore a broad swath of possible physical erasures with 735 different protocols. For each protocol, we simulate 100,000 trajectories to estimate the work cost and errors \(\epsilon _\text {R}\) and \(\epsilon _\text {L}\) of the operation.

Figure 6 compares the work spent for each of the 735 erasure protocols to the sampled maximum error \(\epsilon = \max (\epsilon _\text {L},\epsilon _\text {R})\). Each protocol corresponds to a green cross, whose vertical position corresponds to the shifted work \(\langle W\rangle _\text {shift}\), which accounts for inhomogeneities in the error rate. Note that the exact bound \(\langle W \rangle _\text {min}^{t\text {-sym}}\) from Eq. (6) reduces to a simple relationship between work and error tolerance \(\epsilon \) when the errors are homogeneous \(\epsilon _\text {R}=\epsilon _\text {L}=\epsilon \):

\[
\langle W \rangle _\text {ref}^{t\text {-sym}} = k_\text {B}T \Bigl [ \ln 2 + \epsilon \ln \epsilon + (1 - \epsilon ) \ln (1 - \epsilon ) + \bigl ( \tfrac{1}{2} - \epsilon \bigr ) \ln \frac{1 - \epsilon }{\epsilon } \Bigr ] ~,
\]

which we plot with the blue curve in Fig. 6. The cost of inhomogeneities in the error is evaluated by the difference between this reference bound and the exact work bound. This cost is added to the calculated work for each protocol to determine the shifted work:

\[
\langle W \rangle _\text {shift} = \langle W \rangle + \langle W \rangle _\text {ref}^{t\text {-sym}} - \langle W \rangle _\text {min}^{t\text {-sym}} ~,
\]

such that the vertical distance between \(\langle W \rangle _\text {shift}\) and \(\langle W \rangle _\text {ref}^{t\text {-sym}}\) in Fig. 6 gives the true difference \(\langle W \rangle - \langle W \rangle _\text {min}^{t\text {-sym}}\) between the average sampled work and exact bound for the simulated protocol.
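The shift is straightforward to implement; the sketch below assumes the homogeneous-error reference form described above and a 50–50 initial distribution, with all works in units of \(k_\text {B}T\):

```python
import math

def binary_entropy(x):
    # Shannon entropy of a binary distribution, in nats
    return 0.0 if x in (0.0, 1.0) else -x * math.log(x) - (1 - x) * math.log(1 - x)

def w_min_tsym(eps_L, eps_R):
    """Exact time-symmetric bound for possibly unequal error rates,
    assuming the Eq. (6) form with a 50-50 initial distribution."""
    avg = 0.5 * (eps_L + eps_R)
    return (math.log(2.0) - binary_entropy(avg)
            + 0.5 * eps_L * math.log(eps_L / (1.0 - eps_R))
            + 0.5 * (1.0 - eps_R) * math.log((1.0 - eps_R) / eps_L))

def w_ref_tsym(eps):
    """Reference bound at homogeneous error eps_L = eps_R = eps."""
    return w_min_tsym(eps, eps)

def shifted_work(w_sampled, eps_L, eps_R):
    """Shift the sampled work so that its gap to the reference curve
    equals its gap to the exact (inhomogeneous) bound."""
    eps = max(eps_L, eps_R)
    return w_sampled + w_ref_tsym(eps) - w_min_tsym(eps_L, eps_R)
```

For homogeneous errors the shift vanishes, so `shifted_work` simply returns the sampled work.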

Figure 6 shows that the shifted average works for all of the simulated protocols (green, with error bars) lie above the reference work bound (blue). Thus, all simulated protocols satisfy the bound \(\langle W \rangle \ge \langle W \rangle _\text {min}^{t\text {-sym}}\). Furthermore, many simulated protocols come quite close to their exact bound. There are protocols with smaller errors, but they require larger average works. The error–dissipation tradeoff is clear.

The error–dissipation tradeoff is further illustrated in Fig. 6 by the red line, which describes the low-\(\epsilon \) asymptotic bound \(\langle W \rangle _\text {min}^\text {approx}\) given by Eq. (7). In this semi-log plot, it rather quickly becomes an accurate approximation for small error.

Finally, Fig. 6 plots the Landauer bound \(\langle W \rangle _\text {min}^\text {Landauer}\) as a dotted orange line. It is calculated using the final probability of the \(\text {R}\) mesostate. The bound is weaker than that set by \(\langle W \rangle _\text {ref}^{t\text {-sym}}\). As \(\epsilon \rightarrow 0\), the gap between \(\langle W \rangle _\text {ref}^{t\text {-sym}}\) and \(\langle W \rangle _\text {min}^\text {Landauer}\) in Fig. 6 relentlessly increases. The stark difference in the energy scale of the time-symmetric bounds developed here and that of the looser Landauer bound shows a marked tightening of thermodynamic bounds on computation.

Notably, the protocol Landauer originally proposed requires *significantly more work* than his bound \(k_\text {B}T \ln 2\) to reliably erase a bit. This extra cost is a direct consequence of his protocol’s time symmetry. Time-*asymmetric* protocols for bit erasure have, in fact, been used in experiments that more nearly approach Landauer’s bound [45, 46]. It is not clear, however, to what extent time asymmetry was an intentional design constraint in their construction, since until now there was no general theoretical guidance for why time symmetry or asymmetry should matter. Figures 3 and 6 confirm that Ref. [46]’s time-asymmetric protocol for bit erasure—where the barrier is lowered before the tilt, but then raised before untilting—is capable of reliable erasure that is more thermodynamically efficient than any time-symmetric protocol could ever be.

These underdamped simulations drive home the point that our bounds are independent of the details of the dynamics used for computation. Our results are very general in that regard. As long as the system *starts* metastable and is then driven by a time-symmetric protocol, the error–dissipation tradeoff quantifies the minimal dissipation that will be incurred (for a desired level of computational accuracy) by the time the system relaxes again to metastability.

## 4 Conclusion

We adapted Ref. [12]’s thermodynamic treatment of time-symmetric protocols to give a detailed analysis of the tradeoffs between accuracy and dissipation encountered in erasing information.

Reference [12] showed that time symmetry and metastability together imply a generic error–dissipation tradeoff. The minimal work expected for a computation \({\mathcal {C}}\) is the average nonreciprocity. In the low-error limit—where the probability of error must be much less than unity (\(\epsilon \ll 1\))—the minimum work diverges according to:

Of all this work, only the meager Landauer cost \(\Delta {\text {H}}({\mathcal {M}}_t)\), which saturates to a finite value as \(\epsilon \rightarrow 0\), can in principle be thermodynamically recovered. Thus, irretrievable dissipation scales as \(\ln (\epsilon ^{-1} )\). The reciprocity coefficient \( \bigl \langle \bigl [ \!\! \bigl [{\mathcal {C}}({\mathcal {C}}({\mathcal {M}}_0))\ne {\mathcal {M}}_0 \bigr ] \!\! \bigr ]\bigr \rangle _{{\mathcal {M}}_0}\) depends only on the deterministic computation to be approximated. This points to likely energetic inefficiencies in current implementations of reliable computation. It also suggests that time-asymmetric control may allow more efficient computation—but only when time asymmetry is a free resource, in contrast to modern computer architecture.

The results here verified these general conclusions for erasure, showing in detail how tight the bounds can be and, for high-reliability thermodynamic computing, how they overwhelm Landauer’s. It may be fruitful to explore the ideas behind our results in explicitly quantum, finite, and even zero-temperature systems. Refined versions of Landauer’s bound and other thermodynamic results can be obtained for such models [47, 48]. Also, explicit consideration of finite-time protocols can reveal efficiency advantages when treating ensembles of systems under majority-logic decoding [49,50,51]. Perhaps analogous refinements of the results presented here can be found as well.

Despite the almost universal focus on information erasure as a proxy for all of computing, we now see that thermodynamic computing carries a wide diversity of costs. Looking to the future, these costs must be explored in detail if we are to design and build more capable and energy-efficient computing devices. Beyond engineering and sustainability concerns, explicating Landauer’s Stack, the full suite of thermodynamic costs underlying modern computing, will go a long way toward understanding the fundamental physics of computation, one of Landauer’s primary goals [52].

## References

1. Landauer, R.: Irreversibility and heat generation in the computing process. IBM J. Res. Dev. **5**(3), 183–191 (1961)
2. Parrondo, J.M.R., Horowitz, J.M., Sagawa, T.: Thermodynamics of information. Nat. Phys. **11**(2), 131–139 (2015)
3. Boyd, A.B., Mandal, D., Crutchfield, J.P.: Identifying functional thermodynamics in autonomous Maxwellian ratchets. New J. Phys. **18**, 023049 (2016)
4. Gomez-Marin, A., Parrondo, J.M.R., Van den Broeck, C.: Lower bounds on dissipation upon coarse graining. Phys. Rev. E **78**(1), 011107 (2008)
5. Deffner, S., Jarzynski, C.: Information processing and the second law of thermodynamics: an inclusive, Hamiltonian approach. Phys. Rev. X **3**, 041003 (2013)
6. Li, J., Horowitz, J.M., Gingrich, T.R., Fakhri, N.: Quantifying dissipation using fluctuating currents. Nat. Commun. **10**(1), 1–9 (2019)
7. Still, S.: Thermodynamic cost and benefit of memory. Phys. Rev. Lett. **124**(5), 050601 (2020)
8. Gopalkrishnan, M.: A cost/speed/reliability tradeoff to erasing. In: Calude, C.S., Dinneen, M.J. (eds.) Unconventional Computation and Natural Computation. Lecture Notes in Computer Science, vol. 9252, pp. 192–201. Springer, Berlin (2015)
9. Lahiri, S., Sohl-Dickstein, J., Ganguli, S.: A universal tradeoff between power, precision and speed in physical communication. arXiv:1603.07758
10. Boyd, A.B., Patra, A., Jarzynski, C., Crutchfield, J.P.: Shortcuts to thermodynamic computing: the cost of fast and faithful information processing. arXiv:1812.11241
11. Boyd, A.B., Mandal, D., Crutchfield, J.P.: Thermodynamics of modularity: structural costs beyond the Landauer bound. Phys. Rev. X **8**, 031036 (2018)
12. Riechers, P.M., Boyd, A.B., Wimsatt, G.W., Crutchfield, J.P.: Balancing error and dissipation in computing. Phys. Rev. Res. **2**(3), 033524 (2020)
13. Kolchinsky, A., Wolpert, D.H.: Dependence of dissipation on the initial distribution over states. J. Stat. Mech. Theory Exp. **2017**(8), 083202 (2017)
14. Riechers, P.M.: Transforming metastable memories: the nonequilibrium thermodynamics of computation. In: Wolpert, D., Kempes, C., Stadler, P., Grochow, J. (eds.) The Energetics of Computing in Life and Machines. SFI Press, Santa Fe (2019)
15. Boyd, A.B., Mandal, D., Riechers, P.M., Crutchfield, J.P.: Transient dissipation and structural costs of physical information transduction. Phys. Rev. Lett. **118**, 220602 (2017)
16. Riechers, P.M., Crutchfield, J.P.: Fluctuations when driving between nonequilibrium steady states. J. Stat. Phys. **168**(4), 873–918 (2017)
17. Loomis, S., Crutchfield, J.P.: Thermodynamically-efficient local computation and the inefficiency of quantum memory compression. Phys. Rev. Res. **2**(2), 023039 (2019)
18. Technology Working Group: The International Technology Roadmap for Semiconductors 2.0: Executive Summary. Technical report, Semiconductor Industry Association (2015)
19. Conte, T., et al.: Thermodynamic computing. arXiv:1911.01968
20. Technology Working Group: The International Roadmap for Devices and Systems: 2020. Executive Summary. Technical report, Institute of Electrical and Electronics Engineers (2020)
21. Technology Working Group: The International Roadmap for Devices and Systems: 2020. More Moore. Technical report, Institute of Electrical and Electronics Engineers (2020)
22. Technology Working Group: The International Roadmap for Devices and Systems: 2020. Beyond CMOS. Technical report, Institute of Electrical and Electronics Engineers (2020)
23. Shalf, J.: The future of computing beyond Moore’s law. Philos. Trans. R. Soc. **378**, 20190061 (2020)
24. Klages, R., Just, W., Jarzynski, C. (eds.): Nonequilibrium Statistical Physics of Small Systems: Fluctuation Relations and Beyond. Wiley, New York (2013)
25. Sagawa, T.: Thermodynamics of information processing in small systems. Prog. Theor. Phys. **127**(1), 1–56 (2012)
26. Berut, A., Arakelyan, A., Petrosyan, A., Ciliberto, S., Dillenschneider, R., Lutz, E.: Experimental verification of Landauer’s principle linking information and thermodynamics. Nature **483**, 187–190 (2012)
27. England, J.L.: Dissipative adaptation in driven self-assembly. Nat. Nanotech. **10**(11), 919 (2015)
28. Saira, O.-P., Matheny, M.H., Katti, R., Fon, W., Wimsatt, G., Han, S., Crutchfield, J.P., Roukes, M.L.: Nonequilibrium thermodynamics of erasure with superconducting flux logic. Phys. Rev. Res. **2**, 013249 (2020)
29. Bennett, C.H.: Thermodynamics of computation–a review. Int. J. Theor. Phys. **21**, 905 (1982)
30. Sagawa, T.: Thermodynamic and logical reversibilities revisited. J. Stat. Mech. Theory Exp. **2014**(3), P03025 (2014)
31. Zulkowski, P.R., DeWeese, M.R.: Optimal finite-time erasure of a classical bit. Phys. Rev. E **89**, 052140 (2014)
32. Zulkowski, P.R., DeWeese, M.R.: Optimal protocols for driven quantum systems (2014). arXiv:1506.03864
33. Zulkowski, P.R., DeWeese, M.R.: Optimal control of overdamped systems. Phys. Rev. E **92**(3), 032117 (2015)
34. Aurell, E., Gawȩdzki, K., Mejía-Monasterio, C., Mohayaee, R., Muratore-Ginanneschi, P.: Refined second law of thermodynamics for fast random processes. J. Stat. Phys. **147**(3), 487–505 (2012)
35. Proesmans, K., Ehrich, J., Bechhoefer, J.: Finite-time Landauer principle. Phys. Rev. Lett. **125**(10), 100602 (2020)
36. Proesmans, K., Ehrich, J., Bechhoefer, J.: Optimal finite-time bit erasure under full control. Phys. Rev. E **102**(3), 032105 (2020)
37. Jarzynski, C.: Hamiltonian derivation of a detailed fluctuation theorem. J. Stat. Phys. **98**(1–2), 77–102 (2000)
38. Crooks, G.E.: Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences. Phys. Rev. E **60**, 2721 (1999)
39. Lan, G., Sartori, P., Neumann, S., Sourjik, V., Tu, Y.: The energy-speed-accuracy trade-off in sensory adaptation. Nat. Phys. **8**(5), 422 (2012)
40. Riechers, P.M., Crutchfield, J.P.: Fluctuations when driving between nonequilibrium steady states. J. Stat. Phys. **168**(4), 873–918 (2017)
41. Cover, T.M., Thomas, J.A.: Elements of Information Theory, 2nd edn. Wiley-Interscience, New York (2006)
42. Schroeder, B., Pinheiro, E., Weber, W.-D.: DRAM errors in the wild: a large-scale field study. In: SIGMETRICS/Performance’09, Seattle, WA, pp. 1–12 (2009)
43. Jun, Y., Gavrilov, M., Bechhoefer, J.: High-precision test of Landauer’s principle. Phys. Rev. Lett. **113**, 190601 (2014)
44. Hong, J., Lambson, B., Dhuey, S., Bokor, J.: Experimental test of Landauer’s principle in single-bit operations on nanomagnetic memory bits. Sci. Adv. **2**, e1501492 (2016)
45. Dillenschneider, R., Lutz, E.: Memory erasure in small systems. Phys. Rev. Lett. **102**, 210601 (2009)
46. Jun, Y., Gavrilov, M., Bechhoefer, J.: High-precision test of Landauer’s principle in a feedback trap. Phys. Rev. Lett. **113**, 190601 (2014)
47. Reeb, D., Wolf, M.M.: An improved Landauer principle with finite-size corrections. New J. Phys. **16**(10), 103011 (2014)
48. Timpanaro, A.M., Santos, J.P., Landi, G.T.: Landauer’s principle at zero temperature. Phys. Rev. Lett. **124**(24), 240601 (2020)
49. Miller, H.J.D., Guarnieri, G., Mitchison, M.T., Goold, J.: Quantum fluctuations hinder finite-time information erasure near the Landauer limit. Phys. Rev. Lett. **125**(16), 160602 (2020)
50. Sheng, S., Herpich, T., Diana, G., Esposito, M.: Thermodynamics of majority-logic decoding in information erasure. Entropy **21**(3), 284 (2019)
51. Proesmans, K., Bechhoefer, J.: Erasing a majority-logic bit (2020). arXiv:2010.15885
52. Landauer, R.: Private communication with J. P. Crutchfield (1981)

## Acknowledgements

The authors thank the Telluride Science Research Center for hospitality during visits and the participants of the Information Engines Workshops there for helpful discussions. JPC thanks the Santa Fe Institute, the Institute for Advanced Study at the University of Amsterdam, and the California Institute of Technology for their hospitality during visits. This material is based upon work supported by, or in part by, FQXi Grant number FQXi-RFP-IPW-1902, Templeton World Charity Foundation Power of Information fellowship TWCF0337, and U.S. Army Research Laboratory and the U.S. Army Research Office under contract W911NF-13-1-0390 and grant W911NF-18-1-0028.


## Additional information

Communicated by Sebastian Deffner.


## Appendices

### A: Proof of Exact Bound for Time-Symmetric Protocols

Here, we prove Eq. (2). This is an exact bound under the following conditions: the protocol is time-symmetric; it operates on systems that start and end in metastable equilibrium; the corresponding metastable regions look the same in state space versus energy up to translation, rotation, and reflection in state space; the memory states are symmetric under time reversal; and the equilibrium probability of a microstate at any time does not change under time reversal of the microstate. These conditions are all met in a wide variety of computational processes, including the bit erasure processes studied here. For a more general bound that applies under fewer restrictions, see Ref. [12].

We start with Eq. (1), Crooks’ DFT applied to time-symmetric protocols:

To help with the algebra, we introduce a second way to define reverse process probabilities:

We also allow both \(P\) and *Q* to take as arguments system microstate trajectories \(\overrightarrow{s}\), pairs of initial and final memory states \(\overrightarrow{m}= (m(0), m(\tau ))\), or a combination of the two, where \(\overrightarrow{m}\) may even appear as a condition. We define the time reverse of the pair of initial and final memory states as \(\overrightarrow{m}^\dagger = \bigl ( m(\tau )^\dagger , m(0)^\dagger \bigr )\). (Note that \(m^\dagger \equiv \{ s^\dagger : s \in m \}\) and we explicitly assume in the following that each memory state is time-reversal invariant; i.e., \(m^\dagger = m\) for all \(m \in \varvec{{\mathcal {M}}}\).) For example, the probability of observing the reverse trajectory in the reverse process conditioned on observing the reverse of the pair of initial and final memory states is:

where \(\overrightarrow{M}\) is the random variable for the pair of initial and final memory states.

To derive Eq. (2), we first show that the average dissipated work is bounded below by a Kullback–Leibler divergence between forward-process and reverse-process memory-state distributions. Letting \(\overrightarrow{{\mathcal {M}}}\) be the set of all possible pairs of initial and final memory states:

where \(\text {D}_\text {KL} [ \cdot \Vert \cdot ] _{\overrightarrow{\mathcal {M}}}\) is the Kullback–Leibler divergence for distributions over all \(\overrightarrow{m}\in \overrightarrow{{\mathcal {M}}}\) and \(\text {D}_\text {KL}[\cdot ||\cdot ]_{\vec {m}}\) is that for all \(\overrightarrow{s}\) consistent with \(\overrightarrow{m}\). Since Kullback–Leibler divergences are nonnegative, we arrive at the following inequality:

Second, we show that the above Kullback–Leibler divergence equals the righthand side of Eq. (2).
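The first step rests on the chain rule for the Kullback–Leibler divergence: the trajectory-level divergence decomposes into a memory-state-level term plus a nonnegative average conditional term. A toy discrete check, where the joint distributions are arbitrary stand-ins for the forward- and reverse-process statistics:

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence (in nats) between discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy joint distributions over (memory-state pair m, trajectory s),
# stored as P[m] = list of joint probabilities P(m, s).
P = {"m1": [0.3, 0.1], "m2": [0.4, 0.2]}
Q = {"m1": [0.1, 0.2], "m2": [0.3, 0.4]}

# Full trajectory-level divergence over the joint distribution:
d_full = kl([p for m in P for p in P[m]], [q for m in Q for q in Q[m]])

# Marginal divergence over memory-state pairs:
PM = {m: sum(P[m]) for m in P}
QM = {m: sum(Q[m]) for m in Q}
d_marg = kl(list(PM.values()), list(QM.values()))

# Average conditional (within-memory-pair) divergence:
d_cond = sum(
    PM[m] * kl([p / PM[m] for p in P[m]], [q / QM[m] for q in Q[m]])
    for m in P
)
# Chain rule: d_full == d_marg + d_cond. Since d_cond >= 0, the
# memory-state-level divergence alone lower-bounds the full divergence.
```

Dropping the nonnegative conditional term is exactly the step that yields the inequality above.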

To help with this second task, we start by establishing that \(Q({\mathcal {M}}_0=m| {\mathcal {M}}_\tau =m') = P({\mathcal {M}}_\tau =m|{\mathcal {M}}_0=m')\). We have:

The denominator is simply the probability of observing the memory state \(m'\) at the end of the forward process:

Since the process begins and ends in a metastable distribution, the probability of observing a particular microstate given some memory state is the same as the probability of observing that microstate given that the system has locally equilibrated to that memory state. And, since the process is time symmetric, the starting and ending local equilibrium distributions are the same. That is, letting \(\pi _t^{(m')}(s)\) denote the local equilibrium probability of observing microstate \(s\) given memory state \(m'\) at time *t*:

Additionally:

We then have:

Note that in the second to last line we simply changed the symbols denoting the variables of integration.

We now invoke our assumption that the memory partitions are chosen to be time-reversal invariant, such that \(m^\dagger = m\) for all \(m \in \varvec{{\mathcal {M}}}\). If we furthermore assume that the equilibrium probability of a microstate does not change under time reversal, then time-reversal symmetry of the memory states implies that the local equilibrium probability of a microstate does not change either:

We therefore have:

This allows us to rewrite the memory-state trajectory probability under the reverse process in terms of forward process probabilities:

We now show that \(\text {D}_\text {KL} [ P \Vert Q ] _{\overrightarrow{\mathcal {M}}}\) equals the righthand side of Eq. (2), completing the proof:

where \(H({\mathcal {M}}_t)\) is the entropy of the memory state at time *t*.

### B: Langevin Simulations of Erasure

To help simulate a wide variety of protocols, we first nondimensionalize the equations of motion, using variables:

Note that the position scale \(\sqrt{b_0 / 2a}\) is the distance from zero to either well minimum in the default potential \(V(\cdot , 0) = V(\cdot , \tau )\). Substitution then provides the following nondimensional equations of motion:

with:

which is the first nondimensional parameter to specify an erasure process.
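The position scale quoted above is easy to verify: for a symmetric quartic double-well \(V(x) = a x^4 - b_0 x^2\), the minima sit at \(x = \pm \sqrt{b_0 / 2a}\). A quick numerical check with arbitrary (illustrative) coefficients:

```python
import math

a, b0 = 1.7, 3.2  # arbitrary positive coefficients, not the paper's values

dV = lambda x: 4 * a * x**3 - 2 * b0 * x    # V'(x)
d2V = lambda x: 12 * a * x**2 - 2 * b0      # V''(x)

x_min = math.sqrt(b0 / (2 * a))  # claimed well position

# The derivative vanishes at the claimed minimum,
# and the curvature there is positive (= 4 * b0), so it is a true minimum.
slope = dV(x_min)
curvature = d2V(x_min)
```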

The nondimensional potential can be expressed as:

where:

are two more nondimensional parameters to specify and:

simply express \(b_f\) and \(c_f\) with the nondimensional time as input. The fourth and final nondimensional parameter is the nondimensional total time:

To explore the space of possible underdamped erasure dynamics, we simulate 735 different protocols, determined by all combinations of the following values for the four nondimensional parameters: \(m' \in \{ 0.25, 1.0, 4.0 \}\), \(\alpha \in \{ 2, 4, 7, 10, 12 \}\), \(\zeta \in \{ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7 \}\), and \(\tau ' \in \{ 4, 8, 16, 32, 64, 128, 256 \}\). \(10^5\) trials of each parameter set were simulated.
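The protocol sweep is the Cartesian product of the four parameter sets, which is straightforward to enumerate; the variable names below are ours, not the paper's notation:

```python
from itertools import product

# The four nondimensional parameter sets from the text.
m_primes = [0.25, 1.0, 4.0]
alphas = [2, 4, 7, 10, 12]
zetas = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
taus = [4, 8, 16, 32, 64, 128, 256]

# All combinations: 3 * 5 * 7 * 7 = 735 distinct protocols.
protocols = list(product(m_primes, alphas, zetas, taus))
n_protocols = len(protocols)
```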

For the simulations of Figs. 4 and 5, we set \(m'=1\), \(\alpha =7\), \(\zeta =0.4\), and \(\tau '=100\). Figure 6 shows that the (error, work) pairs obtained for these various dynamics fill in the region allowed by our time-symmetric bounds. These bounds can indeed be tight, but it is quite possible to waste more energy if the computation is not tuned for energetic efficiency.

To update particle position and velocity each time step, we used fourth-order Runge–Kutta integration for the deterministic portion of the equations of motion and a simple Euler method in combination with a Gaussian number generator for the stochastic portion. To determine the time step size, we considered a range of possible time steps for 81 of the possible 735 parameter sets and looked for convergence of the sampled average works and maximum errors \(\epsilon \), again using \(10^5\) trials per parameter set.
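A minimal sketch of this splitting scheme (fourth-order Runge–Kutta for the deterministic drift, a simple Euler kick with Gaussian noise for the stochastic part) applied to a generic underdamped Langevin equation \(\dot{x} = v\), \(\dot{v} = -\gamma v + f(x, t) + \sqrt{2 \gamma k_\text {B} T}\, \xi (t)\). The potential, damping, and step size here are placeholders, not the paper's simulation values:

```python
import math
import random

def step(x, v, t, dt, force, gamma=1.0, kT=1.0, rng=random):
    """One time step: RK4 on the deterministic drift, then an
    Euler (Euler-Maruyama) Gaussian kick for the thermal noise."""
    def drift(x, v, t):
        return v, -gamma * v + force(x, t)

    # Fourth-order Runge-Kutta on the deterministic part.
    k1x, k1v = drift(x, v, t)
    k2x, k2v = drift(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, t + 0.5 * dt)
    k3x, k3v = drift(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, t + 0.5 * dt)
    k4x, k4v = drift(x + dt * k3x, v + dt * k3v, t + dt)
    x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6

    # Euler step for the stochastic part: Gaussian kick on the velocity.
    v += math.sqrt(2 * gamma * kT * dt) * rng.gauss(0.0, 1.0)
    return x, v

# Usage: relax toward equilibrium in a static quadratic well
# (a placeholder potential, not the erasure double-well).
random.seed(0)
x, v, t, dt = 1.0, 0.0, 0.0, 0.01
for _ in range(10_000):
    x, v = step(x, v, t, dt, force=lambda x, t: -x)
    t += dt
```

In a full erasure simulation the force would come from the time-dependent double-well protocol, and work would be accumulated from the potential's explicit time dependence along each trajectory.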

The maximum errors were stable over the whole range of tested step sizes. Exploring decreasing step sizes, we chose a step size of 0.0025, since there the average works for all 81 parameter sets had stopped fluctuating beyond \(5 \sigma \) of their statistical errors. The error bars presented for the average works in Fig. 6 were then generously set to 5 times the estimated statistical errors, each obtained by dividing the sampled standard deviation by the square root of the number of trials. Error bars for the maximum errors were set to the statistical error of \(\epsilon _\text {L}\) or \(\epsilon _\text {R}\), whichever was the maximum, obtained by assuming binomial statistics.
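The binomial statistical errors mentioned above follow from the standard error of a Bernoulli proportion. A sketch with hypothetical counts (the 50 failures are illustrative, not reported data):

```python
import math

def binomial_stderr(n_errors, n_trials):
    """Standard error of an estimated error probability eps = k/N,
    assuming binomial statistics: sqrt(eps * (1 - eps) / N)."""
    eps = n_errors / n_trials
    return math.sqrt(eps * (1.0 - eps) / n_trials)

# Hypothetical example: 50 failed erasures out of 10^5 trials.
eps_hat = 50 / 10**5
sigma = binomial_stderr(50, 10**5)
```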


## About this article

### Cite this article

Wimsatt, G.W., Boyd, A.B., Riechers, P.M. *et al.* Refining Landauer’s Stack: Balancing Error and Dissipation When Erasing Information.
*J Stat Phys* **183**, 16 (2021). https://doi.org/10.1007/s10955-021-02733-1


### Keywords

- Nonequilibrium steady state
- Thermodynamics
- Dissipation
- Entropy production
- Landauer bound