Revisiting the Monge problem in the Landauer limit

We discuss the Monge problem of mass transportation in the framework of stochastic thermodynamics and revisit the problem of the Landauer limit for finite-time thermodynamics, a problem that got the interest of Krzysztof Gawedzki in the last years. We show that restricted to one dimension, optimal transportation is efficiently solved numerically by well known methods from differential equations. We add a brief discussion about the relevance this has on optimising the processing in modern computers.


Introduction
The last decades have seen an increased interest in establishing a thermodynamic understanding of nonequilibrium processes occurring in small systems.However, unlike macroscopic systems to which thermodynamics apply, small systems are characterised by relatively large fluctuations and time scales that are comparable to their relaxation times, rendering a thermodynamic description inappropriate.
A great advance in our understanding of the thermodynamics of small systems has been made in the last decades due mainly to two developments [11]: on the one hand the discovery of the nonequilibrium fluctuation relations [9], relating the statistics of molecular fluctuations with the microscopic symmetries of the dynamics of small systems far from equilibrium, and ultimately giving theoretical support for the validity of the second law of thermodynamics [8,10,14,18,23], and on the other hand the development of stochastic thermodynamics that derives a thermodynamic framework valid for the single fluctuating trajectories [28,29,27,21].These results have been successfully applied to a broad range of disciplines, including physics [7], chemistry [25], biological systems [27], active matter [30], and computer processing [33], among a host of many others.
One such nonequilibrium process is life itself, maintained at the molecular level by molecular motors determining the finely tuned kinetics of the cell.Molecular motors are responsible of essential life processes such as vesicle transport or cell division [15,26], and the synthesis of ATP [31].Notwithstanding the strong fluctuations these molecular complexes display, it is of central importance to understand the efficiency molecular motors operate.
Another instance in which efficiency plays a central role is that of processing in modern computers.On the one hand, every computation involves an energetic thermodynamic cost.On the other hand, due to the miniaturisation of computer processors, modern transistors operate at scales at which stochastic thermodynamics applies.However, scaling of computer transistors presents nowadays technological challenges, mostly related to the increase of dissipation, thus limiting the scaling up in the number of transistors and therefore, the possible optimisation of the speed of computation [22].This calls for the development of highly efficient computing processes that operate at minimal dissipation.
A fundamental process for computers is that of memory erasure.The minimal amount of energy required to erase a bit of memory is given by the celebrated Landauer limit [17], stating an energetic lower bound of k B T ln 2, where T is the temperature and k B the Boltzmann constant.This energy is eventually dissipated into the environment as heat.The Landauer limit is achieved when the erasure of a bit occurs along a quasi-static process.In practice, memory swaps happen at finite short time scales in a process which is inherently noisy and out of equilibrium.
Fourteen years ago Krzysztof got interested on the mathematical formalisation of (at that time incipient area), nonequilibrium fluctuation relations [6], as well as on the Landauer limit [1].
The memory erasure process can be stated as a finite time nonequilibrium process between an initial state at which a bit is observed to be in a specific state and a final state in which the bit is absent, and indeed the Landauer limit was experimentally verified like this [4].In general, an external control on the nonequilibrium transition can be devise to obtain optimal processes that minimise e.g., dissipation [24,2].In particular, in Ref. [2] such optimisation was shown to be solved for the Langevin dynamics in the overdamped limit by the Monge-Kantorovich optimal mass transport and the Burgers equation.This is particularly relevant in computer processing as there one searches for minimal dissipation but at the same time, fast processing.
In 2012, Krzysztof and his colleagues published a paper [1] in which they related the Landauer Principle [17] to the Monge-Kantorovich optimal mass transport.In a talk by Krzysztof in Geneva in April 2013, "2nd law of thermodynamics for fast random processes" he showed how the Landauer Principle is related to an overdamped Langevin evolution from an initial state to a different final stated.In discussions with him, the question came up on how to efficiently and precisely compute the dissipated energy.There were many methods around at the time, and Krzysztof based his discussion on the papers by Benamou and Brenier [13,12].He also gave a sequence of 3 talks in Helsinki on this subject "Fluctuations relations in stochastic thermodynamics" where an extensive list of references can be found [11].Since the authors of [1] restricted the discussion to 1 dimension, one can use methods from differential equations which are more precise than the methods the used in that paper.This will be part of our contribution, which illustrates the breath of Krzysztof's interests and competence.We miss him.

Stochastic Thermodynamics
We consider finite time nonequilibrium transitions in d dimensions, with dynamics described by the Langevin equation in the overdamped limit where U (x t , t) is a smooth control potential and ξ t white noise with zero mean dξ i t = 0, and covariance dξ i t dξ j t = 2D ij δ(t − t )dt.The diffusion matrix D and mobility matrix µ appearing above, are assumed positive and satisfying the Einstein relation D = k B T µ where k B is the Boltzmann constant, so that the noise models the fluctuations of a thermal bath at temperature T .
During the nonequilibrium transition we assume that the control potential changes from U (x 0 , 0) = V i (x) at time t = 0 to U (x τ , τ ) = V f (x) at time t = τ .Given an initial probability density (x 0 , 0) = i (x), x t defines a Markov diffusion process for times t > 0, with generator The probability density evolves according to the Fokker-Planck equation where Following Ref. [28], the energy balance for the single fluctuating trajectories of the process eq. ( 2.1) leads to the framework of stochastic thermodynamics, developed to give a thermodynamic description to small systems in contact with a heat bath and driven out of equilibrium (see e.g., Refs.[27,21] for a modern review).
Defining the work done on the system during the time interval τ as and the heat released by the system into the environment in the Stratonovich convention as with ∆U = U (x τ , τ ) − U (x 0 , 0), expresses the conservation of energy that holds for every fluctuating trajectory of the transition process, in analogy to the First Law of thermodynamics.
To obtain a fluctuating version of the Second Law of thermodynamics, we first notice that the entropy change associated with the transition from time 0 to τ , can be split into two contributions (2. 3) The first term on the right hand side of eq. ( 2.3) corresponds to the entropy change of the system due to the evolution of the probability density where is simply the Gibbs-Shannon entropy with respect to the instantaneous probability density.
The second contribution on the right hand side of eq. ( 2.3) corresponds to the change of entropy of the environment due to the dissipated heat where Q is the mean heat released during the transition.
To obtain the second law it is expedient to define the current velocity of the process v(x t , t) [20].We first note that the instantaneous probability density for a Markov diffusion process can be written as Moreover, the Fokker-Planck equation eq. ( 2.2), yielding the evolution of (x t , t) can be rewritten as the advection equation in the current velocity defined as The current velocity, defined through an appropriate limiting procedure [20,3,1], has the interpretation of the mean local velocity of the process x t .Correspondingly, in terms of the current velocity, the Fokker-Planck equation is equivalent to deterministic mass transport.Now, using eq.(2.6), the entropy change of the system eq.(2.4), can be written after integration by parts as where Ṡsys is the time derivative of S sys .
In the same way, the change of entropy of the environment becomes (2.9) Combining both contributions and using eq.(2.7), the total entropy change eq. ( 2.3) becomes This expression implies immediately the fluctuating analogue of the second law of thermodynamics, namely ∆S tot > 0 . (2.11) 3 The optimal mass transport The mass transport problem was first considered by Gaspard Monge in 1781 [19].Roughly speaking, the problem consists in calculating the most economic way of moving a volume of mass between two places.The modern approach of Monge's optimal mass problem was formalised by Kantorovich in 1942 [16] (see also [12,32]).Optimal mass transport is nowadays referred as the Monge-Kantorovich problem.Here we adopt Monge's exposition.Let i (x) and f (x) be two probability densities, bounded and with compact support in the reals, and satisfying The optimisation problem is to find an invertible smooth map ϕ := x f (x i ), that is measure preserving, namely and minimises the objective function where C x, x f (x) is the cost transporting the unit mass from its initial distribution i into a final distribution f .In its original formulation, Monge considered the Euclidean distance as the cost function C x, x f (x) = x − x f (x) .However, the cost function can be taken as C x, x f (x) = x − x f (x) r .We will show that the case r = 2 is particularly relevant to formulate a refined Landauer limit for finite time processes.This case was solved by Benamou and Brenier in 1999 [13], and was used by Krzysztof and colleagues in 2012 in relation to the Landauer limit [1].

Minimal dissipation memory erasure
The second law states that for any thermodynamic transformation between two given initial and final states, the total entropy change must be positive, such as in eq.(2.11).This is valid for any thermodynamic transformation, even for quasistatic processes occurring infinitely slowly.Consider now a thermodynamic transformation constrained to be completed in a fixed finite time τ .Dissipation is naturally expected to be larger than in the quasistatic transformation, and the question that arises is: what is the minimal possible dissipation produced in a finite-time transformation?This question was answered in Ref. [1] for isothermal stochastic systems with Langevin dynamics in the overdamped limit of section 2.
This question is particularly relevant to computer processing.Landauer cost of information processing, stating that the erase of a bit of information is performed at a cost of no less than k B T ln 2 dissipated heat [17].The Landauer limit continues to be the main reference in information processing because the process of bit erasure is the elementary operation that produces maximal dissipation in universal computing with transistor logic gates [22].
In this section we review briefly this optimal solution by following Ref.[1].Consider the stochastic process x t of eq.(2.1), driven out of equilibrium by the control U (x t , t) from a state i (x) at time t = 0 to a state f (x) at time t = τ .The goal is to obtain the optimal choice of the control U (x t , t) minimising the dissipation, as given by eq.(2.10), required to drive the system along such transformation, over all the densities (x t , t) and all velocity fields v(x t , t) satisfying eq.(2.6), under the constraint In other words, we need to minimise the functional Apart from a factor τ , the minimisation of eq. ( 4.2) over the fields (x t , t) and v(x t , t) subject to eq. (2.6) and eq.(4.1), was solved by Benamou and Brenier [13] (see also [1]).There it was shown that the optimal velocity current minimising eq. ( 4 where φ(x t , t) is convex and a solution of the Hamilton-Jacobi equation.Eq. ( 4.3) implies that the optimal solution corresponds, through eq.(2.7), to the optimal control U (x t , t), and that v is also the local velocity of the optimal control.Restricting ourselves to smooth velocity fields v such that the Lagrangian trajectories x(t) satisfy ẋ(t) = v(x(t), t), the solution to the advection equation eq.(2.6) is given by where x(t; x i ) denotes the Lagrangian trajectory that at time t = 0 passes through x i .Under the controlled transformation, the Lagrangian map x i → x f (x i ) should transport the initial density i into the final density f .Substituting eq.(4.4) into eq.( 4.2), we can replace the minimisation of the functional A over the velocity fields v into the minimisation of over the Lagrangian flows satisfying This constraint is equivalent to where is the Jacobean of the Lagrangian map.We require the Lagrangian map to be smooth and invertible, with a smooth inverse x f → x i (x f ).
Minimising first over time under the above constraints we realise that for a positive definite matrix µ the minimal Lagrangian trajectories correspond to straight lines Therefore, the optimal solution is completed once the functional is minimised over all Lagrangian maps x i → x f (x i ).
Eq. (4.7) corresponds to the Monge-Kantorovich transportation problem of eq.(3.2), with a quadratic cost, solved in Ref. [13], and in Refs.[2,3,1] in the context of stochastic thermodynamics.In Ref. [2] it was shown that minimisation of eq.(4.7) is solved by the Burgers equation over the velocity potential φ, and mass transport by the Burgers velocity field eq.(2.6).
Once the minimiser x f (x i ) is obtained, the minimal value of the functional eq.( 4.2) is where C min is the value of the quadratic cost eq.(4.7) over the minimiser Lagrangian map eq.(4.6).Then it follows that the minimum entropy production during a transition from i (x) to f (x) satisfying eq.(3.1) in a fixed time τ is Finally, eq.(2.5) and the value eq.(4.8) yield a Landauer bound for the average dissipated heat during the erasure of one bit of information in overdamped Langevin dynamics

Numerical solution of the assignation problem
Given an initial i (x) and final f (x) densities, in this section we deal with the problem of solving numerically the assignation problem to find the minimiser of the Lagrangian map.This was done in Ref. [1] by means of several methods.Direct integration of the constraint eq.(4.5) was found to become unstable at values of x i for which the derivative dx f (x i )/dx i diverges (and similarly for the Figure 1: The initial distribution i (x) (black) and the final distribution f (x) (red).While the problem is of course quite easy, intuitively, the issue is how to find the "best" numerical solution.
inverse map).Similar problems were also found using a rearrangement "auction" algorithm [5].After discussing this with Krzysztof in Geneva in 2012, we decided to use another, hopefully faster and more precise method.In the rest of this section we show its implementation and discuss its performance and general limitations.We assume that i (x) and f (x) are given and without loss of generality, satisfy eq.(3.1).The minimiser of the Lagrangian map, namely the optimal map x i → x f (x i ) that transports i into f and minimises the cost eq.(4.7), satisfies eq.(4.5).
We consider the optimal mass transport that was considered in Ref. [1]: where Z i and Z f are normalisation factors so that eq.(3.1) is satisfied, and the constants a = 112k B T µm −4 and α = 0.5µm were chosen to match the experi- mental realisation of Ref. [4].As a matter of fact, eq. ( 2.1) models the dynamics in the experiment with a mobility µ = 0.213877 . Furthermore, the control potential U (x t , t) is effectively one-dimensional 1 .The initial and final densities of eq.(5.1) are shown in Fig. 1.
The minimiser of the Lagrangian map, namely the optimal map x i → x f (x i ) that transports i into f and minimises the cost eq.(4.7), satisfies for all x i , and uniqueness is guaranteed if for each x i one chooses the minimal x f (x i ).However, a basic problem to take into account when solving eq. ( 5.2) is that, if we are given i and f then the normalisation of the integrals is only numerically guaranteed to the available precision of the computer.This implies that the numerical solution ceases to exist at the ends of the supports of i and f .
To have a better control over this difficulty one can solve the equivalent con- straint eq.(4.5), that in one dimension reads . ( The problem of finding the minimiser map is simply transformed into solving an ODE.eq. ( 5.3) defines a vector field in the (x i x f ) plane that can be easily obtained numerically 2 .For any point in this plane, the vector field determines the local evolution of eq.(5.3).We show this in Fig. 2 for a number of points (x i x f ) chosen randomly.
Note that the vector field is not really defined, outside the central region (the central lobe in Fig. 2), as the problem has infinite derivatives in the vertical direction.The same happens at the left and right ends of the central area where the derivatives are zero and thus, the inverse of the Lagrangian map is not well defined.
In view of Fig. 2, solving eq. ( 5.3) requires to choose appropriate initial conditions.It is clear that it is a good idea to start somewhere in the center, at some height y ∼ 0.5 and to integrate both backward and forward.Up to numerical precision of about 10 −20 we found that integrating backwards from x i = 0 fixed, the best numerical initial condition is x f = 0.493113178303063601340966142029819 x Figure 4: The vector field of the flow of the numerical convergence, with the minimiser of the Lagrangian map overlaid in blue.and when integrating forward x f = 0.493113178303063601752771313946797.These values represent the numerical possibilities for the example of eq.(5.1).They were obtained by shooting as follows: we start with x f = 0.5, which is about the center of Fig. 2. We integrate backwards and check for which x i the solution ceases to exist, either because the solution blows up (the graph is vertical), or the solution becomes constant (the graph is horizontal).We then try another value of x f (0) (e.g., x f = 0.46), measure the divergence value for this new initial condition, and then solve by bisection for better and better values of x f (0) until the value of x i at which divergence is observed, cannot be improved any more.
To exemplify our shooting, the approximations through the bisection procedure are shown in Fig. 3 for the backward integration (left panel) and forward integration (right panel).We needed about 110 iterations to reach the maximal precision.The speed of convergence can certainly be improved by e.g., quadratic interpolation.
The solution we obtain in this way is shown in Fig. 4 as the blue curve.It yields a discrete approximation of the minimiser of the continuous Lagrangian map, that can be improved by increasing the numerical precision.
Finally, since the current velocity is gradient (see eq. (4.3)), the current veloc-ity is simply the time derivative of eq.(4.6) and the optimal control potential is obtained from eq. (2.7) (see Ref. [1] for further details).

Conclusions
In this paper we have explored the numerical solution of the Monge problem of optimal transportation, applied to the bit erasure problem and the Landauer limit explored by Krzysztof in Ref. [1].
The densities of eq. ( 5.1) are appropriate to discuss the problem of erasing a bit: At the initial time t = 0, the probability for a particle to be at the left (around x = −0.5),or at the right (around x = 0.5), is the same.This corresponds to the bit to be in state 0 or 1 with equal probability, and is well described by a Gibbs state i ∝ exp − 1 k B T V i (x) of eq. ( 5.1), with a potential V i (x) with two symmetric wells separated by a sufficiently high barrier.At final time t = τ , the final Gibbs states f corresponds to a potential with only one of the two wells, in our case to a well centered at x = 0.5.This means the final state of the bit is 1, irrespective of its initial state.
The particular choice of eq. ( 5.1) was obtained to reasonably match those used in the experiment reported in Ref. [4].The entropy change between i and f is ∆S sys ≈ −0.74312k B , slightly smaller than −(ln 2)k B .
To obtain the minimal dissipated heat during this transition, we reduced the optimal mass transport in one dimensions to an ODE problem, and showed that shooting allows to find the optimal solution through bisection.The crucial information to obtain a convergent method was the knowledge of the vector field of the solution space (see Fig. 2).
Plugging the solution of the minimiser Lagrangian map into eq.(4.7), we obtain the minimal cost C min = 1.98897268kB T .In conclusion, erasing a bit through a transformation whose dynamics is described by eq.(2.1) evolving in a finite time τ between the states of eq.(5.1) can be performed by an average dissipated heat which is We consider the value of the minimal cost that we obtained an improvement to the value 1.996k B T that Krzysztof obtained in [1].We wish we could have discussed this with him.

Figure 2 :
Figure 2: Vector field of the solution of the Lagrangian map x f (x i ) for the example of eq.(5.1).

Figure 3 :
Figure 3: The successive improvement of the shooting solution by changing the initial condition until the solution exists over a maximal extent.