Three types of Landauer's erasure principle: a microscopic view

An important step toward incorporating information into the second law of thermodynamics was taken by Landauer, who showed that the erasure of information implies an increase in heat. Most attempts to justify Landauer's erasure principle are based on thermodynamic arguments. Here, using just the time-reversibility of classical microscopic laws, we identify three types of Landauer's erasure principle depending on the relation between the two final environments: the one linked to a logical input 1 and the other to the logical input 0. The strong type (which is Landauer's original formulation) requires the final environments to be in thermal equilibrium. The intermediate type, giving an entropy change of $k_B \ln 2$, occurs when the two final environments are identical macroscopic states. Finally, the weak Landauer's principle, providing information erasure with no entropy change, occurs when the two final environments are macroscopically different. Even though the above results are formally valid for classical erasure gates, a discussion on their natural extension to quantum scenarios is presented. This paper strongly suggests that the original Landauer's principle (based on the assumption of thermalized environments) is fully reasonable for microelectronics, but it becomes less reasonable for future few-atom devices working at THz frequencies. Thus, the weak and intermediate Landauer's principles, where the erasure of information is not necessarily linked to heat dissipation, are worth investigating.


Introduction
For more than a century, important efforts have been devoted to understanding the entropic and energetic costs of manipulating information. The first attempt to incorporate information into thermodynamics came as early as 1871, when James Clerk Maxwell presented the gedanken experiment now known as Maxwell's demon [1]: a demon in the middle of a container with a trapdoor could transfer the fast, hot particles from a cold side to a hot one, in apparent violation of the second law of thermodynamics, if he had enough information about the particle velocities and positions. An analysis of Maxwell's demon was conducted by Szilard [2] in 1929, when he studied an idealized heat engine with a one-particle gas and directly associated the information acquired by measurement with the physical entropy. Any practical implementation of Maxwell's demon requires a finite memory to store information about the decisions whether, for each particle, the trapdoor will be open or closed. Charles Bennett [3], and independently Oliver Penrose [4], clarified that the erasure of each bit of information in the memory requires a dissipation of heat in the environment, thus recovering the validity of the second law when the memory (demon) and its environment are properly included in the thermodynamic discussion. Bennett's and Penrose's conclusions were based on the previous work of Rolf Landauer [5] in 1961, showing that the erasure of information requires dissipation of a (minimum) amount of heat equal to $k_B T \ln 2$, where $k_B$ is Boltzmann's constant and $T$ is the temperature. The work of Landauer is considered a key element of what Bennett named thermodynamics of computation [3,6-8], or what nowadays is known by the more general term information thermodynamics [9-12], as seen in Fig. 1.
Fig. 1 Solid symbols denote the number of times cited as a function of year for the keywords Landauer's erasure principle (red), Information thermodynamics (green) and Thermodynamics of computation (orange). Open symbols denote the same information for some of the relevant papers mentioned in the references of this manuscript. The seminal work of Landauer reached a maximum of attention in the literature with its first experimental validation by Berut et al. [14]. The data are extracted from Ref. [43].
The original motivation of Landauer's work, as part of his job as a researcher at the International Business Machines Corporation (IBM), was however not to establish a link between information and thermodynamics, but just to find the minimal (if any) amount of heat dissipated by an ideal computer. He brilliantly anticipated the minimum dissipation of $k_B T \ln 2$ per bit. The heat dissipated in computers is nowadays much larger, and it is the real bottleneck that prevents further progress. The power dissipated in electron devices is directly proportional to the working frequency. Thus, the higher the frequency at which we make computations, the more heat is dissipated. The overall amount of power that can be dissipated from the chip imposes a limit on the operating frequency of around 5 GHz for real computers, as seen in Fig. 2. In other words, the technology to build a single transistor working at frequencies as high as 1 THz is well developed, but not the technology to extract the amount of heat generated in a chip with $10^{10}$ of such transistors [44]. In Fig. 2 we indicate the power dissipation of the first computer, built in 1945. It was named Electronic Numerical Integrator And Computer (ENIAC) [45] and it required 174 kilowatts of power to run 5000 simple additions or 300 multiplications per second, with a clock rate of 100 kHz. The typical measure of computer performance is given in floating point operations per second (FLOPS).
Although the ENIAC did not work with bits, we can estimate its computer performance at around 500 FLOPS, with a power efficiency of $3 \cdot 10^{-12}$ gigaflops/watt (see the lower green dot in Fig. 2). We compare ENIAC with today's supercomputers. In the November 2020 ranking of supercomputers in terms of energy efficiency [46], the NVIDIA DGX Super-POD was the most energy-efficient supercomputer, with 26.2 gigaflops/watt. This demonstrates an awesome improvement of 12 orders of magnitude in energy efficiency during the last 75 years. We can compare such numbers with Landauer's prediction by noticing that a floating-point operation reads in two numbers and returns one. If this is done on a computer with finite memory capacity, eventually the number which is being returned must erase another number in memory. Thus, according to Landauer's erasure principle, stating that a dissipation of $k_B T \ln 2 \approx 2.8 \cdot 10^{-21}$ joules is required per bit erased, one joule of energy for an ideal Landauer computer would allow rewriting $3.6 \cdot 10^{20}$ bits. Using 64 bits for a floating-point number, one joule of energy would allow about $6 \cdot 10^{18}$ floating point operations, which means $6 \cdot 10^{9}$ gigaflops/watt. See the frequency-independent result given by the orange line in Fig. 2. Certainly, the fact that even the most energy-efficient computers today are still 8 orders of magnitude below the Landauer limit implies that the electronic industry needs to solve many problems before Landauer's erasure principle becomes a relevant issue [47].
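The comparison above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming a room temperature of T = 300 K (the text does not state the temperature, so the figures differ slightly from the rounded values quoted above) and the 64-bits-per-operation convention used in the text; all variable names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)
T = 300.0           # ASSUMPTION: room temperature in K, not stated in the text

# Landauer bound: minimum heat dissipated per erased bit.
landauer_joules_per_bit = K_B * T * math.log(2)

# Ideal "Landauer computer": bits rewritten and 64-bit operations per joule.
bits_per_joule = 1.0 / landauer_joules_per_bit
bits_per_flop = 64
landauer_gflops_per_watt = bits_per_joule / bits_per_flop / 1e9

# Figures quoted in the text for ENIAC and the NVIDIA DGX Super-POD.
eniac_gflops_per_watt = 500.0 / 174e3 / 1e9
dgx_gflops_per_watt = 26.2

# Orders of magnitude separating the three machines.
gain_since_eniac = math.log10(dgx_gflops_per_watt / eniac_gflops_per_watt)
gap_to_landauer = math.log10(landauer_gflops_per_watt / dgx_gflops_per_watt)

print(f"Landauer bound at {T:.0f} K: {landauer_joules_per_bit:.2e} J per bit")
print(f"Ideal efficiency: {landauer_gflops_per_watt:.2e} gigaflops/watt")
print(f"ENIAC -> DGX: about {gain_since_eniac:.1f} orders of magnitude")
print(f"DGX -> Landauer limit: about {gap_to_landauer:.1f} orders of magnitude")
```

Changing `T` or `bits_per_flop` shows how weakly the conclusion depends on these conventions: the gap to the Landauer limit remains of the order of 8 decades.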
The overall message of Landauer's erasure principle is that, even after developing the best future technology, one that reduces the heat dissipation in computers by 8 orders of magnitude, we will still be faced with the fact that some dissipated heat ($k_B T \ln 2$ per bit) will not be an unnecessary nuisance, but a fundamental part of data erasure that cannot be avoided in any way, independently of the details of the computing device.
The central topic of our paper is how fundamental Landauer's erasure principle is, and whether some type of extension (or generalization) is possible. In general, Landauer's erasure principle is presented (and understood) in the literature as a fundamental result that cannot be avoided in any way. But is it universally true that, independently of the details of the computing device, a heat dissipation of $k_B T \ln 2$ per bit cannot be avoided when data are erased? We anticipate that the fact that the Landauer limit in Fig. 2 is independent of the frequency (processor speed) is suspicious, because the process of thermalization (the change from a non-equilibrium to an equilibrium thermodynamic state) is a dynamical process that requires some time in either classical or quantum reservoirs.
Fig. 2 Solid green symbols denote the power efficiency of different CPUs as a function of the processor speed. The orange line denotes the power efficiency limit of the strong (original) Landauer's erasure principle. The blue shaded region corresponds to processor speeds (operating frequencies) close to or above 1 THz, where the assumption of a thermal environment is less evident. The yellow dashed line shows the tendency in the last decades, indicating that computers will reach the non-thermal-equilibrium regime before reaching the Landauer limit, so that predictions of computing efficiency based on non-thermal reservoirs will become more relevant than the strong (original) Landauer's erasure limit.
Since Landauer's erasure principle is based on a thermodynamic explanation of computations, at first sight it seems that the preliminary question that we have to answer is: how fundamental is thermodynamics? Thermodynamics is a scientific discipline that explains complex systems through macroscopic properties, avoiding the need to discuss microscopic details. Historically, the thermodynamic laws were developed only for systems in the so-called thermodynamic equilibrium. In recent years, however, thermodynamics as a scientific theory has been extended to systems outside of thermodynamic equilibrium [48]. The so-called classical irreversible thermodynamics, under the hypothesis of local equilibrium, carries most of the concepts and tools of equilibrium thermodynamics over to non-equilibrium systems. Nowadays, even systems outside of local equilibrium are being studied in different branches of thermodynamics [48]. Thus, whether a computing device represents a system that can be studied with some branch of thermodynamics is not in question. Owing to the flexibility of thermodynamics as a scientific discipline, it is always possible to construct a branch of thermodynamics with the ability to predict the macroscopic behavior of computing devices, even outside of thermodynamic equilibrium.
Thermodynamics is becoming a science of everything [49], including a science of information thermodynamics.
The path followed in this paper to understand the universality (or the lack thereof) of Landauer's erasure principle is a study of the erasure of information from a microscopic (mechanical) point of view, just by assuming the time-reversibility of microscopic laws, and then checking whether or not our general results (independent of any thermodynamic concepts) coincide with the original Landauer's erasure principle. We show that, depending on the type of final environment involved in the erasure of a logical 1 or a logical 0, three results can be established. The original Landauer's erasure principle, which we refer to as the strong type of Landauer's erasure principle, is recovered when the final state of the environment is in thermodynamic equilibrium. Alternatively, an intermediate relation between manipulation of information and entropy change can be deduced when the only (macroscopic) condition imposed on the final environments is that they look indistinguishable (from a macroscopic point of view) when different logical inputs are involved. Such an intermediate relation gives the well-known limit $k_B \ln 2$ of entropy change when applied to an erasure gate. Finally, for states of the environment that look distinguishable, we establish the weak type of Landauer's erasure principle, which implies no entropy change for erasure computations.
Thus, we conclude that the original (strong) Landauer's erasure result is not universal because thermal reservoirs are not universal. As we shall discuss in the last part of this paper, there are modern reservoirs/environments that never thermalize. Moreover, in case of thermalization, the dynamical transition from non-thermal to thermal reservoirs requires some time. In other words, as depicted by the shaded region of Fig. 2, the thermal reservoir assumption of the strong Landauer's erasure principle cannot be accepted uncritically for computing devices that switch from one state to the other faster than the time required to thermalize the reservoir. In the modern language of open systems [50], these fast-changing gates involve non-Markovian environments. As previously indicated, it is important to clarify that the limitations of Landauer's erasure principle are not limitations of information thermodynamics itself, because it is always possible to include some type of macroscopic effects of such non-Markovianity in thermodynamic formulations of computation, beyond the original Landauer's erasure principle.
The structure of the rest of the paper is as follows. In Sec. 2 we define microscopic and macroscopic states, the physical characteristics of a logical gate and the requirements imposed by the time-reversibility of microscopic laws (Liouville theorem). In Sec. 3 we define three types of the relation between manipulation of information and entropy change: strong, weak and an intermediate one, corresponding to three different types of final environments. Finally, we provide a discussion on how the above results can be extended to quantum systems in Sec. 4. We conclude in Sec. 5. We also add two appendixes with technical details.

Definitions
In this section we provide detailed definitions of microscopic and macroscopic states in a general classical erasure gate. A proper understanding of when a set of microscopic states is (or is not) identical to a macroscopic state will be the key element in the developments of Sec. 3.

Defining microscopic states
We consider a closed (or isolated, in the thermodynamic language) system with $N$ degrees of freedom. We distinguish the $N_S$ degrees of freedom of the system (the active region of the computing gate) and the $N_E = N - N_S$ environment degrees of freedom, which represent all the other degrees of freedom. The degrees of freedom of the system are represented by the vector $x$ with $6N_S$ components corresponding to the three positions and three momenta of each particle in physical space. Similarly, the degrees of freedom of the environment are represented by $y$ as a vector in the $6N_E$-dimensional phase space of the environment [51]. The interaction between all degrees of freedom is determined by the (time-independent) Hamiltonian $H(x, y)$, which fully describes the physical implementation of the logical gate.
Definition 1 (Microscopic state) We define a microscopic state of the gate and environment at time $t$ by the point $x^{(j)}(t), y^{(j)}(t)$ in the $6N$-dimensional phase space $\Gamma$, where the superscript $j$ labels different solutions (corresponding to different experiments) from the same Hamiltonian $H(x, y)$.
In general, we will consider $j = 1, \ldots, M$ with $M$ large enough (but not infinite) so that the set of $x^{(j)}(t), y^{(j)}(t)$ is statistically meaningful.

Defining macroscopic properties and macroscopic states
After the definition of microscopic states, we define here macroscopic properties and macroscopic states.
Definition 2 (Macroscopic property) We define a macroscopic property as a function $A : \Gamma \to \mathbb{R}$ that assigns a real value to each point in the phase space $\Gamma$. Two phase-space points $x^{(j)}(t), y^{(j)}(t)$ and $x^{(k)}(t), y^{(k)}(t)$ are macroscopically identical (according to this property $A$) if and only if $A(x^{(j)}(t), y^{(j)}(t)) = A(x^{(k)}(t), y^{(k)}(t))$.
Notice that there are no anthropomorphic implications in the definition of a macroscopic property [52]. No human observation is needed. In our case, $A$ can be the logical information of the system, denoted by the logical symbols 0 and 1. One can define these macroscopic properties as a result of the large-scale resolution of the apparatus involved in the identification of such a property $A$. There is a large set of microscopic states at the output of the gate that are correctly interpreted as belonging to the logical 0 at the input of another subsequent gate. For a simple and objective definition, for example, a maximum distance from a central phase-space point can be used to specify which microscopic states belong to a given macroscopic property.
Once we have a defined macroscopic property, we can define a macroscopic state.
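Definition 3 (Macrostate) We define the macroscopic state (or macrostate) $A$ at time $t$ as the set of all microscopic states $x^{(j)}(t), y^{(j)}(t)$ that have the same macroscopic property $\mathbf{A}$ at that time, namely $A = \{\text{All } x^{(j)}(t), y^{(j)}(t) \in \Gamma \text{ so that } A(x^{(j)}(t), y^{(j)}(t)) = \mathbf{A}\}$. Notice that $A$ is a subspace of $\Gamma$, while (bold) $\mathbf{A}$ is just a number in real space.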
We are now interested in defining the phase-space volume $V_A$ of the macrostate $A$.
Definition 4 (Volume of macrostate $A$) We define the volume of the macrostate $A$ by counting the number of microscopic states that satisfy the condition $A(x^{(j)}(t), y^{(j)}(t)) = \mathbf{A}$ in definition 3 as $V_A = \sum_{j=1}^{M} \delta_{A(x^{(j)}(t), y^{(j)}(t)), \mathbf{A}} \, \Delta\Gamma$, where $\delta_{a,b}$ is the Kronecker delta function (which becomes one when $A(x^{(j)}(t), y^{(j)}(t)) = \mathbf{A}$) and $\Delta\Gamma$ is an irrelevant phase-space volume small enough to accommodate zero or one microstate (see appendix A). We remind the reader that $M$ is large enough (but not infinite) so that the results are statistically meaningful.
We will also be interested in identifying those degrees of freedom of the system alone that belong to the set $A$. We define the system subspace $X_A$ as the set of microscopic points in the system phase space $\Gamma_S$ that belong to $A$, as $X_A = \{\text{All } x^{(j)}(t) \in \Gamma_S \text{ so that } x^{(j)}(t), y^{(j)}(t) \in A\}$. Similarly, for the points in the environment phase space $\Gamma_E$, we define the environment subspace as $Y_A = \{\text{All } y^{(j)}(t) \in \Gamma_E \text{ so that } x^{(j)}(t), y^{(j)}(t) \in A\}$, where the whole phase space is just the product of the system and environment phase spaces, $\Gamma = \Gamma_S \times \Gamma_E$.
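Definitions 2 to 4 and the subspaces just introduced can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: each microstate is reduced to one system coordinate and one environment coordinate, the macroscopic property `logical_value` (logical 1 on the left, logical 0 on the right) is a hypothetical choice, and the cell volume `DELTA_GAMMA` is set to one.

```python
import random

random.seed(0)
M = 1000  # number of sampled microstates (experiments j = 1, ..., M)

# ASSUMPTION: toy microstate (x, y) with a single system coordinate x
# and a single environment coordinate y, both drawn from [-1, 1].
microstates = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(M)]

def logical_value(x, y):
    """Hypothetical macroscopic property A: logical 1 if x < 0, else logical 0."""
    return 1 if x < 0 else 0

def macrostate(states, prop, value):
    """Set of microstates sharing the same value of the macroscopic property."""
    return [s for s in states if prop(*s) == value]

DELTA_GAMMA = 1.0  # irrelevant cell volume holding zero or one microstate

def volume(states, prop, value):
    """Volume of a macrostate: a Kronecker-delta count times the cell volume."""
    return DELTA_GAMMA * sum(1 for s in states if prop(*s) == value)

A0 = macrostate(microstates, logical_value, 0)
X_A0 = [x for x, _ in A0]  # system subspace of the macrostate
Y_A0 = [y for _, y in A0]  # environment subspace of the macrostate

print("V(logical 0) =", volume(microstates, logical_value, 0))
print("V(logical 1) =", volume(microstates, logical_value, 1))
```

The two volumes always add up to `M * DELTA_GAMMA`, because every sampled microstate carries exactly one value of the property.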

Physical gate as a transition between microscopic states
For each $j$-experiment, the Hamiltonian $H(x, y)$ determines the trajectory in the $6N$-dimensional phase space between the initial values $x^{(j)}(t_i), y^{(j)}(t_i)$ and the final values $x^{(j)}(t_f), y^{(j)}(t_f)$.
Definition 5 (Operation) We define an operation or evolution of the states due to the Hamiltonian $H(x, y)$ as a bijective (one-to-one and onto) map $h(t_f, t_i)$ from the phase space $\Gamma$ at time $t_i$ (domain) to the same phase space $\Gamma$ at time $t_f$ (range), where $B$ is the image of $A$ under the bijective mapping $h(t_f, t_i)$, that is, the set of final microscopic states $x^{(j)}(t_f), y^{(j)}(t_f)$ whose initial states $x^{(j)}(t_i), y^{(j)}(t_i)$ belong to $A$. We notice that no macroscopic property $\mathbf{B}$ is used in the description of $B$ as the image of $A$ in definition 5. In other words, the set of microscopic states at the initial time $t_i$ that define a macroscopic state $A$ does not need to be a macroscopic state of the same macroscopic property $A$ at the later time $t_f$.
The proof is simple. By construction, the states that belong to $B$ at $t_f$ are just the states that belonged to $A$ at time $t_i$. This is, in fact, a simpler way of stating the Liouville theorem [53].
We insist that the (non-capital) volume $v_B(t_f)$ does not need to be the volume of a macroscopic state $A$ at the final time $t_f$. Obviously, such evolution of microscopic states encodes an evolution of the logical information as well.
Definition 6 (Physical gate) We define a gate at the physical level (with one bit of information that can take two initial logical values) as the following two maps:
• A map $h_0(t_f, t_i)$ when the involved initial microscopic states $A$ are those belonging to the information 0.
• A map $h_1(t_f, t_i)$ when the involved initial microscopic states $A'$ are those belonging to the information 1.
By construction, such a composed map is also a bijective (one-to-one and onto) map In fact, the bijective maps We are now in a position to present the following proposition, which will be important throughout the paper: The demonstration is simple. Let us imagine that Notice that we have used in proposition 2 the fact that different trajectories, for example $x^{(j)}(t_f), y^{(j)}(t_f)$ and $x^{(k)}(t_f), y^{(k)}(t_f)$, do not cross in phase space at any time. This is the condition that we will check in any proposal of a gate. Notice that physical systems defined by the Hamiltonian $H(x, y)$ always satisfy proposition 2. However, proposition 2 is also true for any (non-Hamiltonian) dynamical system that preserves phase-space volumes. As a consequence, the three Landauer's erasure principles presented in this paper can be relevant not only for the physical gates linked to $H(x, y)$ studied in this paper, but also for applications in other areas outside physics described by divergenceless models.
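The non-crossing and volume-preservation requirements discussed above can be checked on a toy dynamical system. The sketch below does not use a Hamiltonian flow; as a stand-in it assumes a discretized Arnold cat map on an N × N lattice, which is a bijection because its matrix has unit determinant. The sets `A` and `A_prime` are hypothetical initial macrostates for the logical values 0 and 1.

```python
N = 16  # lattice size of the toy phase space (an arbitrary choice)

def cat_map(state):
    """One step of the discretized Arnold cat map (a bijection on the lattice)."""
    q, p = state
    return ((2 * q + p) % N, (q + p) % N)

def evolve(states, steps=5):
    """Image of a set of microstates after a number of map iterations."""
    out = set(states)
    for _ in range(steps):
        out = {cat_map(s) for s in out}
    return out

# Hypothetical initial macrostates: two disjoint strips of the lattice.
A = {(q, p) for q in range(0, 4) for p in range(N)}         # logical 0
A_prime = {(q, p) for q in range(8, 12) for p in range(N)}  # logical 1

B = evolve(A)
B_prime = evolve(A_prime)

print("volume preserved:", len(B) == len(A) and len(B_prime) == len(A_prime))
print("images never overlap:", B.isdisjoint(B_prime))
```

The images `B` and `B_prime` are scattered over the lattice and need not coincide with any simple macrostate, yet they keep the same number of points as `A` and `A_prime` and never share a point.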

Logical gate as transition between macroscopic states
From the logical information alone (forgetting about the microscopic state), we define the gate from a logical point of view as: The difference between the physical and logical gate, which is the central point in our future discussion, can be translated to saying that, contrary to the bijective mapping h(t f , t i ) for microscopic states, the new logical map for macroscopic states i(t f , t i ) is not bijective. For example, we will be interested in two operations that define an erasure gate: the 0 → 0 operation and the 1 → 0 operation. Clearly, the map is not bijective. In the language of computation, it is said that the erasure gate is the simplest example of logical irreversibility, because the final (logical) information 0 does not allow us to deduce what was the initial (logical) information (either 0 or 1).
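The lack of bijectivity of the logical erasure map can be verified directly. The following minimal sketch represents logical gates as Python dictionaries; the helper names `is_injective` and `preimages` are illustrative, and the NOT gate is included only as a contrasting, logically reversible example.

```python
def is_injective(mapping):
    """A finite map is injective when no two inputs share an output."""
    return len(set(mapping.values())) == len(mapping)

def preimages(mapping, output):
    """All logical inputs compatible with a given logical output."""
    return sorted(k for k, v in mapping.items() if v == output)

erasure_gate = {0: 0, 1: 0}  # the 0 -> 0 and 1 -> 0 operations
not_gate = {0: 1, 1: 0}      # a logically reversible gate, for comparison

print("erasure injective:", is_injective(erasure_gate))
print("inputs compatible with output 0:", preimages(erasure_gate, 0))
print("NOT injective:", is_injective(not_gate))
```

For the erasure gate, the output 0 has two preimages, which is exactly the statement that the final logical information does not reveal the initial one.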
We want to clarify why we said in definition 6 that $A, A' \to B, B'$ is physically reversible (bijective), while we said in definition 7 that $A, A' \to C, C'$ can be logically irreversible (not bijective). The first refers to the evolution of microscopic states and the second to the evolution of macroscopic states. By construction, it is possible to find two different phase-space points $\{x^{(j)}(t_i), y^{(j)}(t_i)\} \neq \{x^{(k)}(t_i), y^{(k)}(t_i)\}$ that have the same (logical) information $A(x^{(j)}(t_i), y^{(j)}(t_i)) = A(x^{(k)}(t_i), y^{(k)}(t_i))$, but it is not possible to find two identical (or very similar) phase-space points

Three types of Landauer's erasure principle
Next, we distinguish three types of relation between the erasure of information and its energetic and entropic costs, corresponding to three types of relation between the two final environments: the final environment belonging to the logical operation 1 → 0 and the one belonging to 0 → 0. Only the third type is the one developed originally by Landauer (in terms of a thermal reservoir). We still keep the name (intermediate and weak) Landauer's erasure principle for the other two because we believe that we follow the original motivation of Landauer: encoding information in macroscopic properties and analyzing how the distribution of microscopic states that build such a macroscopic state changes during the erasure procedure. But our approach differs from the original one in the sense that we assume nothing more than time-reversibility of the microscopic laws.
We assume that the gate is characterized by the logical information 0 or 1 (or the corresponding macroscopic states A and A ′ in definition 6), while the environment is characterized by another macroscopic property E 0 or E 1 (or its corresponding macroscopic states E in definition 3). Since a gate involves two operations, whenever needed we will specify which operation we are referring to by using, for example in an erasure gate, the label 1 → 0 or 0 → 0. We will also specify the time at which we are defining the macroscopic properties or states, by writing t i for the initial time and t f for the final one.

The weak Landauer's erasure principle
We first consider erasure gates where the final environments are macroscopically different at the final time:
• Condition C1: ENVIRONMENTS WITH FINAL DIFFERENT MACROSCOPIC PROPERTIES. For two different operations involved in a gate with two initial environment states which have macroscopically identical properties at the initial time (e.g. $E_{1\to0}(t_i) = E_{0\to0}(t_i)$), the two final environment states have different macroscopic properties (e.g. $E_{1\to0}(t_f) \neq E_{0\to0}(t_f)$) at the final time.
Let us analyze C1 for an erasure gate in Fig. 3. The initial logical states 1 in Fig. 3(a) and 0 in Fig. 3(c) are different macroscopically (being on the left and on the right, respectively), while having the same environment macroscopic properties and states, $E_{1\to0}(t_i) = E_{0\to0}(t_i)$ and $Y_{0\to0}(t_i) = Y_{1\to0}(t_i)$. By contrast, the final logical states 0 in Fig. 3(b) and 0 in Fig. 3(d) are macroscopically identical (being both on the right), $X_{0\to0}(t_f) = X_{1\to0}(t_f)$, while having different environment macroscopic properties and states, $E_{1\to0}(t_f) \neq E_{0\to0}(t_f)$ and $Y_{0\to0}(t_f) \neq Y_{1\to0}(t_f)$. We clearly satisfy proposition 2 at all times, so that such an erasure process is possible from our mechanical point of view. Notice that condition C1 implies that the initial macroscopic information will effectively disappear from the final state of the system, but it will appear in the final environment state. These results just show that, due to the time-reversibility of microscopic laws, information can never be erased at the microscopic level in a fully closed system. We note that we have arrived at the same conclusion as Hemmo and Shenker [30], but within a framework that will allow us to reach Landauer's and Bennett's results in a general and compact unified framework.

Proposition 3
The erasure of information with a gate satisfying condition C1 is compatible with no entropy cost ∆S = 0.
For the proof, we use the Boltzmann entropy, defined from the number of microstates that correspond to a macrostate (as discussed in appendix A). We define $V_{1\to0}(t_i)$ as the phase- (1) We have assumed an arbitrary probability $p$ for the 1 → 0 and $1 - p$ for the 0 → 0 operations. The reason why $\Delta S = 0$ is possible is because the condition C1 itself imposes that the envi- The proof is just a consequence of the fact that the initial and final macrostates (seen as light blue and light red regions for the initial and final times, respectively, in Fig. 3) always have the same number of microstates. This fact is the well-known result given by the Liouville theorem [53] when dealing with $A$ and the image of $A$ at a later time. We remind the reader that, in more general scenarios, the number of microscopic points that are part of the macroscopic property $A(t_i)$ at the initial time does not need to be equal to the number of points that are part of the macroscopic property $A(t_f)$ at the final time.
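A counting example may help to see why condition C1 is compatible with $\Delta S = 0$. The sketch below is not the toy model of appendix B; it assumes a hypothetical three-component microstate (system bit `x`, one environment bit `e` that plays the role of the macroscopic environment property, and extra environment microstates `r`) and a swap map as the bijective dynamics. The entropy change is computed, as an assumption, as the probability-weighted logarithm of the ratio of final to initial macrostate volumes, in units of $k_B$.

```python
import math
from itertools import product

R = range(8)  # ASSUMPTION: extra environment microstates, invisible macroscopically

def h(state):
    """Hypothetical bijective dynamics: swap the system bit and the environment bit."""
    x, e, r = state
    return (e, x, r)

full_space = set(product((0, 1), (0, 1), R))
is_bijective = {h(s) for s in full_space} == full_space

# Initial macrostates: logical value x, environment bit e = 0 in both operations.
initial = {x: {(x, 0, r) for r in R} for x in (0, 1)}
final = {x: {h(s) for s in initial[x]} for x in (0, 1)}

def delta_s_over_kb(p):
    """Entropy change in units of k_B, with probability p for the 1 -> 0 operation."""
    return (p * math.log(len(final[1]) / len(initial[1]))
            + (1 - p) * math.log(len(final[0]) / len(initial[0])))

system_erased = all(s[0] == 0 for x in (0, 1) for s in final[x])
env_records_input = all(s[1] == x for x in (0, 1) for s in final[x])

print("bijective dynamics:", is_bijective)
print("system always ends in logical 0:", system_erased)
print("environment bit records the input:", env_records_input)
print("Delta S / k_B for p = 0.3:", delta_s_over_kb(0.3))
```

The logical value disappears from the system and reappears in the environment bit, so the two final environments are distinguishable while each macrostate keeps its number of microstates.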
In fact, reading the original works of Landauer and Bennett carefully, one notices that the possibility of such types of erasure gates, giving $\Delta S = 0$, was already well known to them. Bennett mentioned what he thought was the problem with such types of erasure gates in his 1973 paper [13]. He erroneously concluded that it was not possible to use such erasure gates with condition C1 more than once, because the environment is different each time we use the erasure gate (see the macroscopic states of the environment in Fig. 3(b) and (d)). Contrary to Bennett's conclusion, we argue here that such an erasure gate with $\Delta S = 0$ can be used as many times as typical erasure gates can. It is erroneously argued in [13] that such an erasure gate will require a reset of the environment to its initial state to make the erasure gate useful again. However, we notice that in a conventional gate, in fact, the initial macroscopic state of the environment is not identical to the final macroscopic state of the environment: the final one contains more heat than the initial one. And yet, no reset to the initial cooler environment is assumed each time the gate is used. Similarly, we can assume that the change of state of the environment in the gate of Fig. 3 is small enough for the gate to be used again without reset [56]. See also appendix B with a toy model of an erasure gate satisfying condition C1. This toy-model gate works properly without reset more than 30 times, despite the fact that the environment is modified each time in such a way that one can guess what the initial logical value was by just looking at the environment variation.
The true reason why the erasure gate depicted in Fig. 3 and conventional gates can be used many times is the change in the environment degrees of freedom $y$. In other words, it is mandatory to change the microscopic degrees of freedom of the environment each time an erasure process takes place. Because of the time reversibility of Hamiltonian dynamics, two initially different trajectories of the system alone without environment, A way to use such erasure gates (with $\Delta S = 0$ or with $\Delta S \neq 0$) is to require $y(t)$ to be different each time we use the gate, but not too different. Finally, notice that the role played by the environment $y(t)$ in the gates under condition C1 is quite similar to the role played by the control register in the gates of the reversible computation proposed by Bennett [15,16,23]. In both cases, the environment or the control register is the additional degree of freedom $y(t)$ needed to erase the system information $x(t)$ without violating the time-reversibility of the whole system. The difference is that the control register is an active element in reversible computation, while the environment is interpreted here as a passive element that does not require any attention (reset). Fig. 3 shows an example of how to realize irreversible logic with reversible physics. We are requiring the final environments to be slightly different at the macroscopic level. See also appendix B with a toy model of an erasure gate that works properly without reset. Certainly, our simplified erasure gate has a limit on the number of times it can be used. But, in principle, it is not different from conventional erasure gates in our computers, because they also have a limit on the number of times that they can be consecutively used, which is related to the limit on the extra heat that can be absorbed by the environment when we take into account that the number $N_E$ of environment particles is not strictly infinite.
Does the result obtained above, where the information is erased without entropy cost, violate the original Landauer's prediction? Is the exorcism of Maxwell's demon by Bennett and Penrose, based on Landauer's prior cost for erasing data, wrong? We notice that our environments in C1, as plotted in Fig. 3, are not in thermodynamic equilibrium, so our results, as such, are not pertinent to discussions about systems that have assumed the hypothesis of thermodynamic equilibrium. Arguments on why we can expect non-thermal environments in some experiments will be discussed in more detail in Sec. 4. However, as discussed in the introduction, thermodynamics is a scientific discipline flexible enough to accommodate these new results into a new (irreversible or non-equilibrium) branch of thermodynamics. Finally, the reader can argue that a fair discussion of the environment in present-day real computers has to involve a really large number of degrees of freedom ($N_E \gg 10^{23}$), making it almost impossible to distinguish final environments, contrary to what we have stated in the C1 condition and in appendix B. Certainly, there are many environments in our ordinary life that can be considered thermal environments. But we will show in the last section of this paper that recent experiments on the equilibration of closed quantum systems show environments that never thermalize, or for which the transition from a non-thermal to a thermal reservoir (for those which thermalize) needs some time. Thus, gates at very high frequency can imply (non-Markovian) environments that do not have enough time to thermalize (to become independent of their initial conditions 1 or 0). There is a huge difference between saying that condition C1 is technologically difficult to reach, and saying that C1 is impossible to reach because it violates fundamental laws.
In summary, there is no fundamental reason to expect that only thermal reservoirs can be applied to computations, so there is no reason to expect that the original Landauer limit cannot be overcome in future nano-devices.

The intermediate Landauer's erasure principle
The second type of relation between the erasure of information and entropic and energetic changes can be obtained by assuming that the final environments are identical at the macroscopic level:
• Condition C2: ENVIRONMENTS WITH IDENTICAL FINAL MACROSCOPIC PROPERTIES. For two different operations involved in a gate with two initial environment states which have macroscopically identical properties at the initial time (e.g. $E_{1\to0}(t_i) = E_{0\to0}(t_i)$), the two final environment states also have identical macroscopic properties (e.g. $E_{1\to0}(t_f) = E_{0\to0}(t_f)$) at the final time [52,54].
Notice that we are not imposing that the initial environment state is macroscopically identical to the final environment state in a given operation (we have shown in the previous subsection that this is impossible for an erasure gate), but only that the two final environment states of the different operations involved in a gate are macroscopically identical.
We analyze again an erasure gate [57] in Fig. 4 with condition C2. The initial logical states 1 in Fig. 4(a) and 0 in Fig. 4(c) are different macroscopically (being on the left and on the right respectively), while having the same environment macroscopic states, E 1→0 (t i ) = E 0→0 (t i ). The final logical states 0 in Fig. 4(b) and 0 in Fig. 4(d) are macroscopically identical (being on the right). Interestingly, we cannot distinguish macroscopically the final environment macroscopic states, E 1→0 (t f ) = E 0→0 (t f ), as seen in the light red regions in Fig. 4(b) and (d). If we look microscopically at Fig. 4, we see that all the microscopic points (solid red points) in the phase space Γ satisfy the time-reversibility imposed by the condition in proposition 2, i.e. the solid red points of Fig. 4(b) never overlap with the solid red points of Fig. 4(d). As we have repeatedly stressed, a gate which is physically time-reversible (at the microscopic level) can be logically irreversible (at the macroscopic level). Notice that the distinguishability (or indistinguishability) between two final macroscopic states can have an objective definition, for example, by imposing a minimum (or maximum) phase-space distance between any two final microscopic states belonging to different macroscopic states.
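The distance-based criterion mentioned at the end of the paragraph above can be sketched in a few lines. This is only an illustration: the threshold `d_min`, the helper names and the sample points are hypothetical and do not come from the paper.

```python
import math

def min_phase_space_distance(states_a, states_b):
    """Smallest Euclidean distance between any point of states_a and any point of states_b."""
    return min(math.dist(p, q) for p in states_a for q in states_b)

def macroscopically_distinguishable(states_a, states_b, d_min):
    """True if every cross pair of microstates is at least d_min apart (assumed criterion)."""
    return min_phase_space_distance(states_a, states_b) >= d_min

# Hypothetical final microstates (x, y) of the two operations of a gate.
final_1_to_0 = [(0.70, 0.80), (0.75, 0.90), (0.80, 0.85)]
final_0_to_0 = [(0.72, 0.20), (0.78, 0.10), (0.83, 0.15)]

d = min_phase_space_distance(final_1_to_0, final_0_to_0)
print(f"minimum cross distance = {d:.3f}")
print("distinguishable with d_min = 0.5:",
      macroscopically_distinguishable(final_1_to_0, final_0_to_0, 0.5))
print("distinguishable with d_min = 0.7:",
      macroscopically_distinguishable(final_1_to_0, final_0_to_0, 0.7))
```

Whether two sets of final microstates count as different macrostates then depends only on the chosen threshold, which is the sense in which the definition can be made objective.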

Proposition 4
The erasure of information with a gate satisfying condition C2 implies a minimum entropy cost ∆S = k B ln 2.
We define V 1→0 (t i ) as the phase-space volume of the initial macrostate We have assumed equal a priori probabilities for the 1 → 0 and 0 → 0 operations. The reason why the entropy increase is minimal is because we

be larger than the final macroscopic state. Again the fundamental microscopic proposition 2 is satisfied and a Hamiltonian
The proof is just a consequence of the fact that the number of microstates of the final macrostate is not equal to the number of microstates of the image of the initial macrostate, as seen in Fig. 4. This result was already indicated by Landauer himself [5]. Notice, however, that we have made no reference to thermodynamic equilibrium at all in the present development (just counting the number of microscopic states that satisfy a macroscopic property). For this reason, we refer to the result Eq. (2) as the intermediate Landauer's erasure principle, because it is more general than the original Landauer limit, which implicitly assumed that all the entropy increase was due to a production of heat. In this regard, Bennett wrote [16] explicitly: "Typically the entropy increase takes the form of energy imported into the computer, converted to heat, and dissipated into the environment, but it need not be, since entropy can be exported in other ways, for example by randomizing configuration degrees of freedom in the environment."
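As a minimal numerical sketch of the counting argument, assume that each initial macrostate contains the same number of microstates and that the final macrostate contains the non-overlapping images of both. The function names and the sample counts below are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def boltzmann_entropy(num_microstates):
    """Entropy of a macrostate with num_microstates microstates (additive constant dropped)."""
    return K_B * math.log(num_microstates)

def erasure_entropy_change(m_initial):
    """Entropy change of one operation under the assumed counting for condition C2."""
    # Images of the 1->0 and 0->0 operations never overlap, so their counts add.
    m_final = 2 * m_initial
    return boltzmann_entropy(m_final) - boltzmann_entropy(m_initial)

for m in (10, 10**6, 10**23):
    ratio = erasure_entropy_change(m) / (K_B * math.log(2))
    print(f"M = {m:.0e}: dS / (k_B ln 2) = {ratio:.6f}")
```

The ratio stays at one for any number of microstates, which reflects that the result depends only on the doubling of the macrostate and not on heat or temperature.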
The main conclusion of this subsection is that the increase in entropy can be translated into other types of entropies different from thermodynamic entropy. We note that the same conclusion was reached by the works of Vaccaro and Barnett [59,60]. They explicitly generalized the Landauer's erasure principle to new scenarios showing that the costs of erasure depend on the nature of the gate and of the environment with which it is coupled. Their papers were inspired by the enlightening previous work of Jaynes [61] that introduced the concept of the generalized second law instead of the usually called second law of thermodynamics, to emphasize that the concept of entropy (as a way of counting how many microstates belong to a given macrostate, as we have done here) does not belong to (equilibrium) thermodynamics only, but can be applied to any system where macroscopic properties matter. We emphasize that, after accepting that the result ∆S = k B ln 2 has, in general, nothing to do with heat or temperature, new types of gates can be envisioned by looking for new types of entropy different from thermodynamic entropy converted into heat. Such new possibilities will violate the original Landauer's erasure principle in terms of heat and temperature, without violating Eq. (2) when C2 is assumed.

Fig. 4 Schematic representation of the initial (left panels) and final (right panels) microscopic states (dark blue and red solid circles) and the volumes of macrostates (light blue and red regions) in the system x plus environment y phase space. The upper panels correspond to the operation 1 → 0 and the lower panels to 0 → 0. The macroscopic property of the system 1 means being on the left of the x axis, while the macroscopic property of the system 0 means being on the right of the x axis. The initial macroscopic environment properties (in the left panels) are identical, E 1→0 (t i ) = E 0→0 (t i ). The macroscopic properties of the environment at the final time (in the right panels) are also identical, E 1→0 (t f ) = E 0→0 (t f ), in the sense that their microscopic differences are not seen in their macroscopic properties. Even though the gate is logically irreversible, it satisfies the time-reversibility of microscopic laws. The relevant point is that condition C2 shows that each operation of the erasure process is done with an increase in entropy: the phase-space volumes (entropies) of the initial macroscopic states (light blue regions in the left panels) are half of the phase-space volumes (entropies) of the final macroscopic states (light red regions in the right panels).

The strong Landauer's erasure principle
The strong relation between manipulation of information and entropy change leads to the original Landauer's erasure principle. To arrive at it, we invoke the following condition on the final state of the environment E B :
• Condition C3: MACROSCOPICALLY IDENTICAL FINAL THERMAL ENVIRONMENTS. The final states of the environment of different processes of a gate (e.g. Y 1→0 (t f ) and Y 0→0 (t f )) are described by the same thermal bath [52,54].
This condition should be understood as a supplement to C2, i.e. in condition C3 we assume that condition C2 is already satisfied. We are not only imposing that the final states of environment are macroscopically identical, but also that the final states of environment can be described by a state in thermodynamic equilibrium with a well defined temperature T.

Proposition 5
For an erasure gate satisfying C3 (which implies satisfying C2 too), the erasure of information implies an increment of heat given by $\Delta Q = k_B T \ln 2$ in the final environments.
For an environment in thermodynamic equilibrium, it is well known that the increment of heat $\Delta Q$ is related to the increment of entropy $\Delta S$ through the thermodynamic relation $\Delta Q = T\Delta S$. Hence, since the increment of entropy is given by Eq. (2), we finally have
$$\Delta Q = T\,\Delta S = k_B T \ln 2. \quad (3)$$
Here, from the macroscopic property T, it is easy to understand how the conditions Y [...] Expression (3) is exactly the original Landauer's erasure principle [5], which we call the strong Landauer's erasure principle to distinguish it from the previous weak and intermediate ones. The universality of the strong Landauer's erasure principle in Eq. (3) is based on the assumption that all final states of environments are indeed thermal baths (condition C3). Following the arguments in previous sections and in the next section, the condition C3 is a good approximation for many real environments in Nature, but not necessarily valid for all of them (especially if we deal with very fast computations).
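For orientation, the heat $k_B T \ln 2$ of Proposition 5 can be evaluated numerically. The sketch below uses the exact SI values of the constants; the three temperatures are illustrative choices and are not taken from the paper.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)
EV = 1.602176634e-19    # one electronvolt in joules (exact SI value)

def landauer_heat(temperature_kelvin):
    """Heat k_B * T * ln(2) associated with erasing one bit in a thermal bath at T."""
    return K_B * temperature_kelvin * math.log(2)

# Illustrative bath temperatures (assumed): liquid helium, liquid nitrogen, room temperature.
for temperature in (4.2, 77.0, 300.0):
    q = landauer_heat(temperature)
    print(f"T = {temperature:6.1f} K: dQ = {q:.3e} J = {q / EV * 1e3:.3f} meV")
```

At room temperature this gives roughly $2.9 \times 10^{-21}$ J, or about 18 meV per erased bit.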
At this point, the reader can wonder why we insist on the failure of the strong (original) Landauer's erasure principle when its limit has been validated by several relevant experiments [14,[35][36][37][38][39][40], as indicated in Fig. 1. All these experiments [14,[35][36][37][38][39][40] have carefully made an effort to ensure that the environment is in thermal equilibrium. Then, for thermal environments, the strong Landauer's principle is a universal result. Loosely speaking, the experiments are designed to explain the strong Landauer's erasure principle, rather than the other way around. In fact, the mentioned experiments have been developed imposing adiabatic conditions on the performance of the erasure processes, which justify that the environment can be treated as a thermal bath. In this regard, the physical transitions seen as left and right distributions of particles in Fig. 4 cannot be done instantaneously. They require some time to thermalize, to change from two distinguishable macrostates to two indistinguishable macrostates. Therefore, it seems obvious that in the race for faster computing devices, at some point, the assumption that the environments of electron devices are always thermalized will not be accurate enough because the reservoir will not have enough time to thermalize. This very point is in fact what we will discuss in the next section, taking advantage of the vast literature on thermalization (or equilibration) in closed quantum systems.

Can the previous results be extended to the quantum regime?
In this paper, we have shown that the original (strong) Landauer's erasure principle cannot be considered a universal result because it is not true that only thermal reservoirs are available for computations. The key element in our discussion is the fact that it is possible to envision final environments for the 1 → 0 and 0 → 0 operations with different macroscopic environment properties. But can we generalize these results to the quantum regime? Below we provide arguments to justify that it is reasonable to expect that what we have explicitly demonstrated to be valid for classical erasure gates is also valid for quantum ones. We note that a rigorous quantum extension is beyond the scope of this paper; here, we only give qualitative evidence for it.
In the quantum regime, the difference between microscopic and macroscopic levels of description is even more important than in classical physics. Microscopic quantum laws seem to be very different from the microscopic classical laws. There is still a strong disagreement in the scientific community on how to define a quantum microscopic state (if it exists at all). In other words, the definition of microscopic states is a rather subtle and controversial issue, because it highly depends on the interpretation of quantum mechanics, on which there is no consensus among physicists [62]. A straightforward demonstration that the developments done in Secs. 2 and 3 can be extended into the quantum regime is given in appendix A (after selecting a proper interpretation of quantum mechanics). Fortunately, a simple understanding of why condition C3 is not universal in the quantum regime, and why there is plenty of room to design erasure gates with conditions C2 and even C1, can be formulated in a (more or less) interpretation-neutral manner (in terms of expectation values) by reusing the recent advances on the process of thermalization of closed quantum systems [63][64][65][66][67][68][69].
Let us suppose that the two operations of the erasure gate are defined by the wave functions Ψ 1 (x, y, 0) for the input logical state 1 and Ψ 0 (x, y, 0) for the input logical state 0. Notice that we are using the variables x and y in the quantum regime as the degrees of freedom of the positions of the system and the positions of the environment, respectively. In this sense, x, y represent a point in the configuration space here, while x, y represented a point in the phase space in the classical regime. The use of the same notation will simplify the comparison of classical and quantum microstates done in appendix A [70].
Since the total Hamiltonian $H(x, y)$ is time-independent, the pure states $\Psi_1(x, y, 0)$ and $\Psi_0(x, y, 0)$ can be described at all times by a unitary evolution $|\Psi_\alpha(t)\rangle = \sum_n c_{n,\alpha} e^{-iE_n t/\hbar} |n\rangle$, with $\alpha = \{1, 0\}$ indicating the initial logical state. The ket $|n\rangle$ is an energy eigenstate of the global Hamiltonian $H(x, y)$ mentioned in Sec. 2 with eigenvalue $E_n$. Here $c_{n,\alpha} = \langle n|\Psi_\alpha(0)\rangle$, which depends on the initial wave function, keeps memory of the initial conditions. The density matrix in the energy representation of such global states can be written as
$$\hat\rho_\alpha(t) = |\Psi_\alpha(t)\rangle\langle\Psi_\alpha(t)| = \sum_n |c_{n,\alpha}|^2 |n\rangle\langle n| + \sum_{m \neq n} c_{m,\alpha} c^*_{n,\alpha} e^{i(E_n - E_m)t/\hbar} |m\rangle\langle n|. \quad (4)$$
The diagonal elements of the density matrix $\rho_{\alpha,n,n} = |c_{n,\alpha}|^2$ are called populations. They forget the phase of $c_{n,\alpha}$ and they are time-independent. On the other hand, the off-diagonal elements $\rho_{\alpha,m,n}(t) = c_{m,\alpha} c^*_{n,\alpha} e^{i(E_n - E_m)t/\hbar}$ are called coherences. They are time-dependent and they quantify the coherence between the eigenstates $|n\rangle$ and $|m\rangle$ by keeping memory of the phase of $c_{m,\alpha} c^*_{n,\alpha}$. See [71] for a discussion of the initial energies.
By construction of an erasure gate, at the final time $t_f$, the macroscopic properties linked to the system are identical so that we can identify such macroscopic properties of both quantum states with the same final logical state 0. Then, we assume that a macroscopic property of the environment can be defined from an expectation value [72, 73] of an arbitrary observable $\hat A$ of the environment that can be written as
$$\langle A\rangle_{\rho_\alpha}(t) = \mathrm{tr}\{\hat A \hat\rho_\alpha(t)\} = \sum_n |c_{n,\alpha}|^2 A_{n,n} + \sum_{m \neq n} c_{m,\alpha} c^*_{n,\alpha} e^{i(E_n - E_m)t/\hbar} A_{n,m}, \quad (5)$$
where $A_{n,m} = \langle n|\hat A|m\rangle$. Thus, the discussion on whether $\langle A\rangle_{\rho_1}(t) \neq \langle A\rangle_{\rho_0}(t)$ or $\langle A\rangle_{\rho_1}(t) \approx \langle A\rangle_{\rho_0}(t)$ is a discussion on whether the off-diagonal elements of the density matrix $c_{m,\alpha} c^*_{n,\alpha} e^{i(E_n - E_m)t/\hbar}$ (that keep the memory of the initial state) are relevant in the evaluation of (5).
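The split of an expectation value into a time-independent population term and a time-dependent coherence term can be checked on a small example. The sketch below assumes a three-level spectrum, a real symmetric observable and two sets of coefficients, all hypothetical, in units where $\hbar = 1$.

```python
import cmath

HBAR = 1.0                       # natural units (assumption)
ENERGIES = [0.0, 1.0, 2.5]       # hypothetical eigenvalues E_n
# Hypothetical real symmetric observable, A[m][n] = <m|A|n>.
A = [[1.0, 0.5, 0.0],
     [0.5, -1.0, 0.3],
     [0.0, 0.3, 2.0]]

def normalize(c):
    norm = sum(abs(x) ** 2 for x in c) ** 0.5
    return [x / norm for x in c]

C1 = normalize([1.0, 1.0j, 0.5])     # hypothetical c_{n,1}
C0 = normalize([0.2, 1.0, -1.0])     # hypothetical c_{n,0}

def expectation_direct(c, t):
    """<Psi(t)|A|Psi(t)> computed from the evolved amplitudes."""
    psi = [cn * cmath.exp(-1j * en * t / HBAR) for cn, en in zip(c, ENERGIES)]
    dim = len(psi)
    total = sum(psi[m].conjugate() * A[m][n] * psi[n]
                for m in range(dim) for n in range(dim))
    return total.real

def expectation_split(c, t):
    """Return (population term, coherence term) of tr(A rho(t))."""
    dim = len(c)
    populations = sum(abs(c[n]) ** 2 * A[n][n] for n in range(dim))
    coherences = sum(
        c[m] * complex(c[n]).conjugate()
        * cmath.exp(1j * (ENERGIES[n] - ENERGIES[m]) * t / HBAR) * A[n][m]
        for m in range(dim) for n in range(dim) if m != n)
    return populations, coherences.real

for t in (0.0, 0.7, 3.0):
    pop, coh = expectation_split(C1, t)
    print(f"t = {t}: populations = {pop:.4f}, coherences = {coh:+.4f}, "
          f"direct = {expectation_direct(C1, t):.4f}")
```

Only the coherence term changes with time, so any memory of the initial logical state beyond the populations has to reside there.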
But such issues have been clarified during the last years in theoretical and experimental works on thermalization of closed quantum systems. In principle, the second term of the right hand side of (5) is a quasi-periodic function different from zero. There are experiments, for example in ultracold quantum gases trapped in ultrahigh vacuum by means of (up to a good approximation) conservative potentials [74,75], that can be considered to be of the type of systems described above, with off-diagonal elements always relevant. The near unitary dynamics of such systems has been observed in beautiful experiments on collapse and revival phenomena of bosonic [76,77] and fermionic [78] fields, without the relaxation phenomena predicted with traditional ensembles of statistical mechanics [79,80]. Thus, as we have argued throughout the paper, there are computing scenarios where the condition C1 that environments are macroscopically distinguishable is physically viable. In fact, all these works on quantum thermalization of closed systems have been motivated by the recent construction of several prototypes of the so-called quantum simulators, where the behavior of a quantum system, which cannot be solved numerically due to the many-body problem of the Schrödinger equation, is empirically realized in the laboratory by studying the evolution of another controlled quantum system that mimics the first one. Obviously, in such (analog) quantum simulations, and also in (digital) quantum computations dealing with qubits, controlled (non-thermal) environments are mandatory to minimize decoherence phenomena.
It is true that a system satisfying condition C1 requires an important technological effort on engineering the behavior of the environment. In fact, there are other closed quantum systems that do thermalize and such processes have been reasonably well understood too. At the initial time $t = 0$, it is clear that $\langle A\rangle_{\rho_1}(0) \neq \langle A\rangle_{\rho_0}(0)$ because we start from macroscopically different states. But, after some time we can find $\langle A\rangle_{\rho_1}(t) \approx \langle A\rangle_{\rho_0}(t)$ if the off-diagonal terms become irrelevant. A simple argument can clarify the need for a delay to reach equilibration in a closed quantum system. Even if none of the terms $c_{m,\alpha}$, $c^*_{n,\alpha}$ and $A_{m,n}$ are exactly zero at any time, it is possible to envision a scenario in which the whole off-diagonal sum in the right hand side of (5) is close to zero because the off-diagonal terms cancel each other when effectively random complex numbers are added. However, such randomization requires some time, which is called the equilibration time $t_{eq}$ in the literature. Then, the time evolution of $\langle A\rangle$ after the equilibration time, $t > t_{eq}$, when the off-diagonal elements of the density matrix are no longer relevant, can be described by a time-independent diagonal density matrix in the energy representation, $\hat\rho_{diag} = \sum_n |c_{n,\alpha}|^2 |n\rangle\langle n|$. In the literature, it is said that a quantum system suffers equilibration when the expectation value in (5) satisfies $\mathrm{tr}\{\hat A\hat\rho\} \approx \mathrm{tr}\{\hat A\hat\rho_{diag}\}$ for the overwhelming majority of times (allowing for some sporadic revivals) larger than the equilibration time $t_{eq}$. Notice that the diagonal matrix, after that, does not yet need to be a (micro-canonical, canonical or grand-canonical) thermal density matrix. In any case, we get macroscopically identical properties of the environments, $\langle A\rangle_{\rho_1}(t) \approx \langle A\rangle_{\rho_0}(t)$. This corresponds to condition C2, where the environments are indistinguishable but not thermal yet. Again, a lot of experimental work on such quantum equilibration scenarios is present in the literature [64][65][66][67].
When $\hat\rho_{diag}$ is roughly equal to the micro-canonical (canonical or grand-canonical in open systems) density matrix, then the quantum system is said to be thermalized. This corresponds to condition C3.
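The passage from distinguishable to indistinguishable expectation values can be illustrated with a toy model. The sketch below is built on strong assumptions that are not in the paper: a 20-level spectrum $E_n = \sqrt{n+1}$ (with $\hbar = 1$), a real symmetric observable, and two initial states with identical populations but different phases. A time average over a window is used as a simple proxy for "the overwhelming majority of times".

```python
import cmath
import math

HBAR = 1.0                                             # natural units (assumption)
N = 20
ENERGIES = [math.sqrt(n + 1.0) for n in range(N)]      # hypothetical non-degenerate spectrum
A = [[1.0 / (1 + abs(m - n)) for n in range(N)] for m in range(N)]  # hypothetical observable
C1 = [1.0 / math.sqrt(N)] * N                          # all coefficients in phase
C0 = [(-1.0) ** n / math.sqrt(N) for n in range(N)]    # same populations, alternating signs

def diagonal_value(c):
    """tr(A rho_diag): population term only."""
    return sum(abs(c[n]) ** 2 * A[n][n] for n in range(N))

def instantaneous(c, t):
    """<A>(t) for real coefficients c (populations plus coherences)."""
    off = sum(c[m] * c[n] * cmath.exp(1j * (ENERGIES[n] - ENERGIES[m]) * t / HBAR) * A[n][m]
              for m in range(N) for n in range(N) if m != n)
    return diagonal_value(c) + off.real

def time_average(c, window):
    """Average of <A>(t) over 0 <= t <= window, using the exact average of each phase factor."""
    off = 0.0
    for m in range(N):
        for n in range(N):
            if m != n:
                omega = (ENERGIES[n] - ENERGIES[m]) / HBAR
                phase_avg = (cmath.exp(1j * omega * window) - 1.0) / (1j * omega * window)
                off += (c[m] * c[n] * A[n][m] * phase_avg).real
    return diagonal_value(c) + off

print("t = 0:        ", instantaneous(C1, 0.0), instantaneous(C0, 0.0))
print("average, 1e5: ", time_average(C1, 1e5), time_average(C0, 1e5))
print("diagonal:     ", diagonal_value(C1), diagonal_value(C0))
```

In this construction the two states start with clearly different expectation values and their long-window averages both approach the same diagonal value. In general the two diagonal ensembles need not coincide, so the equal populations here are a deliberate simplification.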
In the literature [64][65][66][67][68][69], one can find equilibration times $t_{eq}$ ranging from a few femtoseconds to picoseconds, depending on the details and complexity of the systems at hand. If we define $\theta_{n,\alpha}$ as the phase of $c_{n,\alpha}$, a simple (but not rigorous) estimation of $t_{eq}$ can be obtained by noting that at $t = 0$ all phases of the off-diagonal elements (coherences) satisfy $e^{i(\theta_{m,\alpha} - \theta_{n,\alpha})} e^{i(E_n - E_m)0/\hbar} = e^{i(\theta_{m,\alpha} - \theta_{n,\alpha})}$, so that all phases of the off-diagonal elements together perfectly keep memory of the initial state $\Psi_\alpha(x, y, 0)$. To forget such memory, we require that the sum of all coherences in (5) vanishes after an equilibration time $t_{eq}$. If such equilibration occurs, all relevant phases $(E_n - E_m)t_{eq}/\hbar$ have to reach a value equal to or larger than $2\pi$ to ensure that the factors $e^{i(\theta_{m,\alpha} - \theta_{n,\alpha})} e^{i(E_n - E_m)t_{eq}/\hbar}$ are randomly distributed. If we define $\Delta E_{eq} = \min(E_n - E_m)$ over all relevant energies of the system, a simple estimate of the equilibration time is given by
$$t_{eq} \approx \frac{h}{\Delta E_{eq}}, \quad (6)$$
where $h$ is the Planck constant. For a reservoir of length $L = 100$ nm, with a parabolic relation between energy and momentum, we can estimate a minimal energy gap between energy eigenstates equal to $\Delta E \approx 10^{-3}$ or $10^{-4}$ eV. If we use $\Delta E \approx 10^{-4}$ eV in expression (6), we get an approximate value of the equilibration time $t_{eq} \approx 1$ ps. Even though the formula (6) is not rigorous at all, it clarifies that the process of thermalization, in our case changing from different macroscopic properties $\langle A\rangle_{\rho_1}(0) \neq \langle A\rangle_{\rho_0}(0)$ to identical macroscopic properties $\langle A\rangle_{\rho_1}(t_{eq}) = \langle A\rangle_{\rho_0}(t_{eq})$, cannot be instantaneous but requires a time to occur. This conclusion can alternatively be reached from the definitions of Markovian and non-Markovian open quantum systems [50]. An open quantum system interacting with an environment is, in principle, a non-Markovian system.
The evolution of the system (together with the environment) can only be considered Markovian if we consider the evolution in (coarse-grained) time steps larger than the time interval needed for the environment to relax. Thus, the transition from non-Markovian to Markovian relaxation time also requires a time related to the relaxation of the environment. In conclusion, even in typical environments where the assumption of thermalization is reasonable, we cannot have an instantaneous thermalization process. This delay in the thermalization provides an unquestionable limit on the speed of computations to satisfy the basic assumption of the strong (original) Landauer limit. This limit is also shown in Fig. 2. Beyond THz frequencies, the assumption that environments are always macroscopically identical to a thermal bath is not admissible and the original Landauer dissipation seems not applicable.
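The order of magnitude of the equilibration time can be explored with a short calculation. The sketch below rests on assumptions that go beyond the text: the reservoir is modelled as a one-dimensional infinite well of width 100 nm with the free-electron mass, and the equilibration time is taken as $h/\Delta E$, that is, the time for the slowest coherence phase to advance by $2\pi$.

```python
import math

H = 6.62607015e-34        # Planck constant in J s (exact SI value)
HBAR = H / (2 * math.pi)  # reduced Planck constant
M_E = 9.1093837015e-31    # free-electron mass in kg (assumption: no effective mass)
EV = 1.602176634e-19      # one electronvolt in joules

def box_level(n, length_m, mass_kg=M_E):
    """Energy of level n of a 1D infinite well of width length_m (parabolic dispersion)."""
    return (n * math.pi * HBAR) ** 2 / (2 * mass_kg * length_m ** 2)

def equilibration_time(gap_joule):
    """Assumed estimate: time for a phase gap*t/hbar to reach 2*pi, i.e. t = h / gap."""
    return H / gap_joule

length = 100e-9  # reservoir length L = 100 nm, as in the text
gap = box_level(2, length) - box_level(1, length)
print(f"lowest gap of the well: {gap / EV:.3e} eV")

for gap_ev in (1e-3, 1e-4):
    t_eq = equilibration_time(gap_ev * EV)
    print(f"gap = {gap_ev:.0e} eV: t_eq = {t_eq * 1e12:.1f} ps")
```

Under these assumptions the lowest gap is close to $10^{-4}$ eV and the estimate falls between a few and a few tens of picoseconds. Using $\hbar$ instead of $h$ lowers the values by a factor $2\pi$, so only the picosecond order of magnitude should be read from such an estimate.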

Conclusions
After more than 60 years, the Landauer's erasure principle is still accompanied by controversies. In this regard, Landauer himself wrote [81]: "The path to understanding in science is often difficult. If it were otherwise, we would not be needed. This field [fundamental physical limits of information handling], however, seems to have suffered from an unusually convoluted path." What we find especially unfortunate during the recent developments in this field is linking the result of the dissipation in computing gates to equilibrium thermodynamics [82]. This link is unfortunate because it is not only unnecessary (as we have seen in our paper), but it has the undesired effect of unnecessarily limiting the imagination of many researchers. An exception that has overcome this limitation has recently been published in Ref. [83], where erasure gates using squeezed thermal environments are proposed. Thus, at first sight, it seems that any attempt to discuss possible extensions of the Landauer's erasure principle beyond thermodynamic equilibrium requires the flexible tools of non-equilibrium thermodynamics. Such non-equilibrium tools will certainly still require some notion of equilibrium to be able to define what is heat, work, etc. As we mentioned, this is the typical path followed by most investigations on Landauer's extensions. But this is not the path we have followed in our paper. Can we use a description of the erasure process based exclusively on the mechanical (not thermodynamic) laws of physics? Yes, of course. An erasure gate is, at the end of the day, a physical system whose performance follows the fundamental microscopic laws of physics. As an example, in appendix B, we have shown a toy model of an erasure gate whose performance during several repetitions is evaluated by numerically solving the fundamental microscopic laws that simultaneously govern the degrees of freedom of the system and the environment.
The reader can (erroneously) argue that we have used some thermodynamics concepts, not only microscopic laws, along the paper because we have included entropy argumentations. We have only used a definition of entropy as the number of microstates that are present in a given macrostate. By construction, such concept of entropy is perfectly adequate in a microscopic description of any system (independently on whether it is used in thermodynamic discussions too). It only requires the proper definition of a macroscopic state in terms of microscopic states, as we have done in Section 2. Then, of course, in subsection 3.3 we have invoked the equilibrium thermodynamics concepts of heat and temperature, but only to reach the original Landauer formulation, which is nothing but a special case of our general formulation.
The main advantage (and drawback) of our paper is that it uses classical microscopic physics. As such, it provides a mathematically simple and physically rigorous understanding of the three types of Landauer's erasure principle. But, strictly speaking, the results of this paper have not been demonstrated to be valid in quantum scenarios. A rigorous quantum extension of the classical microscopic explanation presented here is beyond the scope of this paper. The main reason is that there is still a strong disagreement in the scientific community on how to define a quantum microscopic state (if it exists at all). In fact, even the wave function (linked to any definition of a microstate) is under lively debate now (does it represent only epistemic knowledge about the outcomes of future measurements, or is it something ontologically real?) [62]. It is not even clear whether the wave function is enough to define a microscopic state, since it is also argued that present quantum theory has to be understood as something emergent, as an average description of an underlying more complicated quantum dynamics (with additional microscopic variables) [62]. Despite this poor understanding of what quantum microscopic states are, in Section 4 we have provided some quite general evidence that it is reasonable to expect that the classical results presented here also apply in a quantum regime. Basically, even under the assumption that a quantum environment will effectively reach some type of equilibrium (whatever it means), some time will be needed to reach it. In addition, in appendix A, after selecting a particular interpretation of quantum mechanics, we also provide a natural extension of the classical results of the main text to the quantum regime.
Finally, let us mention that the strong Landauer's erasure principle has not been relevant yet for practical devices because nowadays other larger sources of dissipation are present. It seems reasonable to expect that in the future, when the other sources of dissipation disappear, the strong Landauer's erasure principle will still not be relevant because future computing devices will work at frequencies for which the assumption of environment in (classical or quantum) thermodynamic equilibrium will no longer be valid as shown in the shaded region in Fig. 2.
We hope that the present work will help to develop new research avenues for engineering computing devices with environments that satisfy condition C2 involving entropy change without heat dissipation, or even approaching condition C1 where the entropy change can be reduced significantly.

Appendix A Counting the number of microstates
In this appendix A, in the first subsection, we will show that the concept of entropy is a way to quantify the number of microstates that belong to a given macrostate. Then, in the second subsection, we will show that the development done in the manuscript in terms of well-defined trajectories can be extended into the quantum regime in a very simple and natural way by using quantum (Bohmian) trajectories.

A.1 Classical procedure
We consider a classical system in an experiment described by the trajectory $x^{(j)}(t), y^{(j)}(t)$ that belongs to the macroscopic state A. Then, the entropy [84] is defined as
$$S_A = k_B \ln V_A, \quad (A1)$$
where $k_B$ is the Boltzmann constant and $V_A$ is the volume in the phase space, definition 4, for all $M_A$ points of the phase space that look macroscopically similar to $x^{(j)}(t), y^{(j)}(t)$, that is, all the phase-space points of the macroscopic state A of the macroscopic property A, according to definition 3. In Fig. A1, we have represented an $L_x \times L_y$ region of the phase space Γ and the $M_A$ microscopic points of the macroscopic state A at the initial $t_i$ and final $t_f$ times. We have drawn 25 cells of area $\Delta\Gamma = L_x/5 \times L_y/5 = L_x \times L_y/25$ in Figs. A1(a) and (b). It seems from these plots that the phase-space volume of A is not proportional to the number of phase-space points (because it also depends on the size of the cells). However, in Figs. A1(c) and (d) we select 225 smaller cells with an area $\Delta\Gamma' = L_x/15 \times L_y/15 = L_x \times L_y/225$ so that each cell accommodates only one (or zero) point. Now, the phase-space volume becomes proportional to the number of points, $V_A = M_A \Delta\Gamma'$. Thus, we can define the entropy for the $M_A = 10$ points at the initial time $t_i$ as
$$S_A(t_i) = k_B \ln(M_A \Delta\Gamma') = k_B \ln M_A + k_B \ln \Delta\Gamma'. \quad (A2)$$
Notice that the use of a smaller area than $\Delta\Gamma'$ in (A2) (for example, $\Delta\Gamma''$ so that each cell still contains zero or one microstate) will only modify the last constant, $k_B \ln \Delta\Gamma' \to k_B \ln \Delta\Gamma''$, which is irrelevant when evaluating the entropy change between two times (as long as both use the same cell grid). Similarly, increasing the number of points will only modify the same irrelevant constant (as long as each cell accommodates only one or zero points). Thus, as is well known, the entropy linked to the macrostate A can be computed from the number of microstates $M_A$ that belong to such macrostate.
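The role of the grid can be made concrete with a small sketch. The ten points below are hypothetical and placed in a unit square ($L_x = L_y = 1$, an assumption made for simplicity). The code counts occupied cells for a coarse and a fine grid, and then checks that an entropy difference does not depend on the cell area once each cell holds at most one point.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def occupied_cells(points, cell_size):
    """Number of distinct square cells of side cell_size that contain at least one point."""
    return len({(math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points})

def entropy(num_points, cell_area):
    """Assumed form S = k_B ln(M * cell_area), valid when each cell holds at most one point."""
    return K_B * math.log(num_points * cell_area)

# Hypothetical microstates of one macrostate in a unit square.
points = [(0.05, 0.05), (0.15, 0.05), (0.05, 0.15), (0.15, 0.15),
          (0.25, 0.25), (0.35, 0.25), (0.25, 0.35), (0.35, 0.35),
          (0.45, 0.45), (0.55, 0.45)]

print("coarse grid (5 x 5):  ", occupied_cells(points, 1 / 5), "occupied cells")
print("fine grid (15 x 15):  ", occupied_cells(points, 1 / 15), "occupied cells")

# Entropy change when the number of microstates doubles, for two different fine grids.
for cell_area in (1 / 225, 1 / 900):
    ds = entropy(20, cell_area) - entropy(10, cell_area)
    print(f"cell area = {cell_area:.5f}: dS / (k_B ln 2) = {ds / (K_B * math.log(2)):.6f}")
```

With the coarse grid several points share a cell, so the occupied-cell count underestimates the number of microstates. With the fine grid the two counts agree, and the entropy change is the same for any sufficiently fine cell area.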

A.2 Quantum procedure
As indicated in the manuscript, identifying the microscopic properties of a quantum system is a rather subtle and controversial issue, because it highly depends on the interpretation of quantum mechanics, on which there is no consensus among physicists [62]. For example, in the standard interpretation of quantum mechanics, it is the many-body wave function in the configuration space that provides such microscopic description, while other interpretations may postulate additional variables or replace the wave function with an entirely different object. Since each theory also provides its own equation of motion for its microscopic state, the diversity of quantum theories implies no consensus on the behavior of the quantum microscopic world. Some interpretations of quantum mechanics claim that the quantum laws for such microscopic states are neither deterministic nor time-reversible. By contrast, other interpretations claim that the microscopic quantum laws are deterministic and reversible, while indeterminism and irreversibility emerge only at the macroscopic level. At this point, a quantum procedure to count the number of microstates requires specifying which quantum theory is used to interpret quantum phenomena. The authors of this paper believe that the conceptually clearest way to think about quantum physics, in general, and about quantum microscopic states, in particular, is to use the Bohmian interpretation of quantum mechanics [85][86][87][88], in which the differences between classical and quantum microscopic physics look less radical than in other interpretations. We choose this interpretation in this appendix A to explicitly show that all developments done in Secs. 2 and 3 for classical systems can be straightforwardly extended to the quantum regime.
The fundamental elements of the Bohmian theory are the many-body wave function Ψ(x, y, t) of a closed quantum system, together with the actual particle positions x (j) (t) of the system and the actual particle positions y (j) (t) of the environment [85,87,88]. The evolution in time of such positions x (j) (t), y (j) (t) represents a trajectory in the configuration space.

Fig. A1 Schematic representation of the initial (left panels) and final (right panels) microscopic states (dark blue and red solid circles) and the volumes of macrostates (light blue and red regions) in the system x plus environment y phase space (or in the system x plus environment y configuration space for a quantum system). The upper and lower panels describe the coarse and fine graining, respectively, of the same initial and final microscopic states, explaining why the volume of a macrostate can be quantified by just counting the number of microscopic states. In the upper panels a large grid is assumed so that the space is divided into cells. In (a) the number of occupied cells is 5 and the number of microstates is M A = 10. In (b) the number of occupied cells is 8 and the number of microstates is M A = 10. Since the Liouville theorem ensures that the volume V A in (a) has to be equal to the volume in (b), from 5 ≠ 8 we see that such a grid does not allow us to identify the volume of the macrostate with the number of microstates. In the lower panels a smaller grid is chosen, by dividing the phase space into smaller cells so that only one microstate (or none) occupies each cell. In (c) and (d) the number of occupied cells is 10 and the number of microstates is M A = 10. For such a grid, identifying the volume of the macrostate with the number of microstates is correct. If more microstates belonging to the macroscopic property A are considered in the discussion, we can use a grid with even smaller cells until we satisfy again the requirement that only one microstate (or none) occupies each cell.
The many-body wave function Ψ(x, y, t) guides the particles by determining their velocities [85][86][87]. In the laboratory, at the initial time t = 0, one can prepare the wave function Ψ(x, y, 0), but one cannot prepare the initial positions [89]. Such initial positions x (j) (0) and y (j) (0) obey the probability distribution |Ψ(x, y, 0)| 2 when the identical experiment is repeated M → ∞ times, where the superindex j labels each experiment j = 1, .., M. Thus, once x (j) (0), y (j) (0) and |Ψ(x, y, 0)| 2 are fixed, the Bohmian interpretation of quantum phenomena is deterministic and time-reversible at the microscopic (ontological) level, just like classical physics. However, since we have no direct control over x (j) (0) and y (j) (0), the Bohmian results are random at the empirical level [86][87][88]. As indicated in Sec. 4, we are using the variables x and y in the quantum regime as a point in the configuration space, rather than as a point in the phase-space as in Secs. 2 and 3 for classical particles. This is the only difference between the quantum (Bohmian) and classical description of microstates. We have seen in the first part of this appendix A that such difference becomes irrelevant when discussing the three types of Landauer's erasure principle explained in the manuscript. Therefore, the use of the same notation for quantum and classical versions will help in straightforwardly reusing the classical results for quantum systems.
First of all, if we want to count quantum microstates, we have to specify in detail the definition of a microscopic state in the Bohmian theory. At first sight, it seems that a microscopic state is determined by the trajectory $x^{(j)}(t)$, $y^{(j)}(t)$ of the $j$-th experiment plus the wave function $\Psi(x, y, t)$ that guides such trajectories. But, in fact, the Bohmian theory is a holistic theory in the sense that it is applicable to the whole Universe, not only to a part of it [87][88][89]. Therefore, strictly speaking, there is just one wave function for describing any process in the Universe. For our practical example of an erasure gate, this means that the wave functions $\Psi_1(x, y, t)$ for the input logical state 1 and $\Psi_0(x, y, t)$ for the input logical state 0 used in Sec. 4 can be substituted by a unique big wave function of a closed system. This big wave function appears in a natural way in Bohmian mechanics by taking into account the remaining degrees of freedom of the laboratory, labeled here as $z$, that determine whether we are dealing with an initial logical 1 or 0. Such a big wave function is written as $\Psi(x, y, z, t)$, and we can define the wave function for 1 as the conditional (Bohmian) wave function [86,88,89] of the big wave function, $\Psi_1(x, y, t) \equiv \Psi(x, y, z^{j}_{1}(t), t)$. Identically, we define the wave function describing the initial logical 0 as $\Psi_0(x, y, t) \equiv \Psi(x, y, z^{j}_{0}(t), t)$.
Here, $z^{j}_{1}(t)$ can be any of the configurations of the positions $z$ of the laboratory set-up (excluding the environment and the system) that are linked to an initial logical state 1. Likewise, $z^{j}_{0}(t)$ for 0. The final result is that a quantum (Bohmian) microstate is defined by $x^{(j)}(t)$, $y^{(j)}(t)$, $z^{(j)}(t)$ alone. We do not need to include the wave function in our attempt to count quantum microstates because, although the wave function is a fundamental ontological element of the Bohmian theory, it is the same for all experiments of the erasure gate. By including $z^{(j)}(t)$ as part of the degrees of freedom of the environment, the quantum (Bohmian) and classical definitions of a microstate are almost identical (the first is a point in the configuration space, while the second is a point in the phase space). Once this small difference is accounted for, all definitions given in Secs. 2 and 3 for counting the number of classical microstates can be reused to reach identical conclusions in the quantum regime. For example, the concept of entropy in the quantum case just requires reinterpreting the phase-space axes in the figures (like Fig. A1) as the axes in the configuration space. Such a change of axes is inessential because we have demonstrated in the first part of this appendix A that what matters in the discussion of entropy is just the number of microstates that belong to each macrostate. Importantly, the non-crossing property of the classical trajectories in the phase space in proposition 2 is also satisfied by the quantum (Bohmian) trajectories in the configuration space. Since all experiments deal with a unique wave function $\Psi(x, y, z, t)$, which is a single-valued function, only one velocity is possible at each position $x, y, z$ and time $t$. This implies that Bohmian trajectories cannot cross in the configuration space [86,88].
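The grid-counting argument of Fig. A1 can be illustrated with a minimal sketch. The ten lattice points and the shear map below are hypothetical stand-ins (they are not the microstates or the dynamics of the figure); the shear is chosen only because it is a one-to-one, area-preserving map, mimicking the role of the Liouville theorem.

```python
def occupied_cells(points, cell):
    """Count the distinct square grid cells of side `cell` that contain a point."""
    return len({(x // cell, y // cell) for x, y in points})


def shear(points, k=3):
    """Hypothetical area-preserving, one-to-one map (x, y) -> (x + k*y, y)."""
    return [(x + k * y, y) for x, y in points]


# Ten hypothetical microstates on an integer lattice (initial macrostate).
initial = [(x, y) for x in range(2) for y in range(5)]
final = shear(initial)

for cell in (5, 1):  # coarse grid, then fine grid
    print(cell, occupied_cells(initial, cell), occupied_cells(final, cell))
```

With the coarse grid the number of occupied cells changes between the initial and final states, although the number of microstates does not. With the fine grid (at most one microstate per cell) both counts coincide with the number of microstates.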

Appendix B Toy model for the weak Landauer's erasure principle
In this appendix B, we describe a toy model of an erasure gate that satisfies the weak Landauer's erasure principle (condition C1). The model is not meant to be realistic, but only to exemplify the physical soundness of this weak version. We consider a 2D system where each particle position has a horizontal $r_x$ and a vertical $r_y$ location. Notice that, in this appendix, we use $x$ and $y$ as directions in physical space, which is different from the meaning assigned to them in the main part of the paper. The particles of the environment, $N_E = 28$, can be separated into two sets. The first set is formed by the 14 particles indicated by the red solid circles at the top of Figs. B2 and B3. The second set of 14 particles is indicated by red solid circles at the bottom of Figs. B2 and B3. The first set can be identified as a "top barrier", while the second as a "bottom barrier". Both "barriers" will guide the system particle from the input toward the output of the gate. In order to minimize the movements of the particles of the environment, for reasons that will become clear later, each particle in the environment has a mass of $m_i = 4000\,m$, with $m$ the free electron mass, and a charge $q_i = 0.04\,q$, with $q$ the electron charge (with sign). The initial velocity of the particles of the environment is zero in both directions, $v_{x,i}(0) = 0$ m/s and $v_{y,i}(0) = 0$ m/s.
In fact, since we are interested in using the same erasure gate several times, there is just one environment ($N_E = 28$ particles) involved all the time, but several systems. We consider the system as a single particle that enters the gate at $r_x = 0$ nm encoding an input logical value, travels inside, and exits the gate at $r_x = 11$ nm encoding the output logical value. Using the same gate another time implies using the same environment but a new degree of freedom for the system, encoding the new logical information. To avoid a complicated notation, since only one system particle interacts with the environment at each operation, we will assume that each system particle is described by a degree of freedom labeled as $n = 29$. The mass of the system particle is $m_{29} = 0.2\,m$ and its charge is $q_{29} = 10\,q$.
Then, the dynamics of all particles (environment plus system), at each operation, is determined by the following Hamiltonian.
with $\epsilon$ the vacuum permittivity and $p_{x,i} = v_{x,i} m_i$ and $p_{y,i} = v_{y,i} m_i$ the momentum components of the $i$-th particle in the horizontal and vertical directions, respectively. Identically, $r_{x,i}$ and $r_{y,i}$ are the position components of the $i$-th particle in the horizontal and vertical directions, respectively. Notice that the "top barrier" and "bottom barrier" are not external potentials, but just particles interacting with the system particle. The dynamics of the $N = 29$ interacting particles is solved numerically by time-integrating the accelerations, computed from Newton's laws, with a time step of $1 \cdot 10^{-18}$ s.

Fig. B2 Red solid circles denote the original initial positions of the $N_E = 28$ particles of the environment during the first operation. Red solid squares denote the original trajectory of the single particle of the system, from the input $r_x = 0$ to the output $r_x = 11$, plotted every 0.2 fs. During the first 0 → 0 operation, each particle of the environment interacts with the system particle and with the other particles, and modifies its position and velocity. When the first system particle leaves the gate, a new system particle enters the gate, and the 0 → 0 logical operation is repeated with the initial conditions of the environment particles in this second operation equal to their final conditions in the first operation. No reset of the environment is considered after each operation. The initial positions of the particles of the environment at the 10th, 20th and 30th repetitions are indicated with empty circles of different colors. The same colors are used to represent the corresponding trajectories of the system at these repetitions with empty squares.
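The integration scheme and the repetition without reset can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the figures: it assumes that (B3) is the usual sum of kinetic terms plus pairwise Coulomb ($1/r$) interactions, it uses a velocity-Verlet step (the text only specifies the time step), and it replaces the 28-particle barriers by four environment particles at invented positions. With such a reduced environment the sketch does not reproduce the erasure itself; it only shows the dynamics, the reuse of the final environment as the next initial condition, and the bookkeeping of the total momentum.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity (F/m)
Q = -1.602176634e-19      # electron charge with sign (C)
M = 9.1093837015e-31      # free electron mass (kg)
NM = 1e-9

# Values quoted in the text.
M_ENV, Q_ENV = 4000 * M, 0.04 * Q
M_SYS, Q_SYS = 0.2 * M, 10 * Q
V0, DT, X_OUT = 2e6, 1e-18, 11 * NM
Y_IN = {0: 3 * NM, 1: 7 * NM}

# HYPOTHETICAL environment: 4 particles instead of 28, positions invented.
ENV_POS0 = [[3 * NM, 0.0], [8 * NM, 0.0], [3 * NM, 10 * NM], [8 * NM, 10 * NM]]


def forces(pos, q):
    """Pairwise Coulomb forces (assumed 1/r potential) on particles in 2D."""
    f = [[0.0, 0.0] for _ in pos]
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            dx, dy = pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]
            k = q[i] * q[j] / (4 * math.pi * EPS0 * math.hypot(dx, dy) ** 3)
            f[i][0] += k * dx
            f[i][1] += k * dy
            f[j][0] -= k * dx
            f[j][1] -= k * dy
    return f


def momentum(vel, m):
    """Total momentum (p_x, p_y) of all particles."""
    return tuple(sum(mi * v[a] for mi, v in zip(m, vel)) for a in range(2))


def run_operation(env_pos, env_vel, bit, max_steps=20000):
    """One operation: a fresh system particle crosses the given environment."""
    pos = [p[:] for p in env_pos] + [[0.0, Y_IN[bit]]]
    vel = [v[:] for v in env_vel] + [[V0, 0.0]]
    m = [M_ENV] * len(env_pos) + [M_SYS]
    q = [Q_ENV] * len(env_pos) + [Q_SYS]
    p_in = momentum(vel, m)
    f = forces(pos, q)
    for _ in range(max_steps):
        if pos[-1][0] >= X_OUT:           # system particle reached the output
            break
        for i in range(len(pos)):         # velocity-Verlet: half kick + drift
            for a in range(2):
                vel[i][a] += 0.5 * DT * f[i][a] / m[i]
                pos[i][a] += DT * vel[i][a]
        f = forces(pos, q)
        for i in range(len(pos)):         # second half kick
            for a in range(2):
                vel[i][a] += 0.5 * DT * f[i][a] / m[i]
    return pos[:-1], vel[:-1], pos[-1], p_in, momentum(vel, m)


def repeat(bit, n):
    """Repeat the same operation n times without resetting the environment."""
    env_pos = [p[:] for p in ENV_POS0]
    env_vel = [[0.0, 0.0] for _ in ENV_POS0]
    history = []
    for _ in range(n):
        env_pos, env_vel, sys_pos, p_in, p_out = run_operation(env_pos, env_vel, bit)
        history.append((sys_pos, p_in, p_out))
    return env_pos, env_vel, history
```

Calling `repeat(0, 2)` runs two consecutive operations with input 0; each entry of the returned history holds the exit position of the system particle and the total momentum before and after that operation.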
In Fig. B2, apart from the initial positions of the environment in solid red circles, we plot the trajectory of the system particle in solid red squares from the input of the gate ($r_x = 0$ nm) to the output ($r_x = 11$ nm). The initial state of the system corresponds to the logical 0 described as {$r_{x,29}(0) = 0$ nm, $r_{y,29}(0) = 3$ nm}, with initial velocity $v_{x,29}(0) = 2 \cdot 10^{6}$ m/s in the horizontal direction and $v_{y,29}(0) = 0$ m/s in the vertical direction. The particle of the system is initially repelled by the "bottom barrier", and later by the "top barrier". Finally, at the horizontal position $r_x = 11$ nm, the system indicates the final logical value 0. As a consequence of the interactions given by (B3), the particles of the environment have slightly modified their initial positions and velocities.
As happens in a real gate, after the first operation, the gate is ready for a second operation. Such a second operation happens when another particle of the system is prepared identically to the first system particle, indicating the initial logical 0. However, the environment particles are now not at the solid red circles of Fig. B2: the initial conditions of the particles of the environment in this second operation correspond to the final conditions of the environment after the first operation. The environment of the second operation is therefore different from the environment of the first one. This is exactly what happens in the strong Landauer's erasure principle. Each time an operation takes place, the final environment becomes hotter than before the operation. As long as the hotter final environment is not much different from the initial one, no reset of the environment degrees of freedom is needed in real erasure gates.
Since we are also interested in avoiding the reset of the environment in our toy model of the weak Landauer's erasure principle, we want an environment that suffers only a small perturbation. Now, it becomes evident why we select such heavy particles ($m_i = 4000\,m$) with such a small charge ($q_i = 0.04\,q$) for the environment. In Fig. B2 we have plotted in circles of different colors the final positions of the particles of the environment at different repetitions of the operation (without a reset of the environment). We also plot the trajectories of the system with squares, using the same colors as for the environments. In Fig. B3, we have repeated the results of Fig. B2, but now considering that the system corresponds to a logical 1 defined as {$r_{x,29}(0) = 0$ nm, $r_{y,29}(0) = 7$ nm}, with the same initial velocity of the system particle as before: $v_{x,29}(0) = 2 \cdot 10^{6}$ m/s in the horizontal direction and $v_{y,29}(0) = 0$ m/s in the vertical direction. The system particle is now initially repelled by the "top barrier", and later by the "bottom barrier". Finally, at the horizontal position $r_x = 11$ nm, the system's microscopic state corresponds to the final logical state 0. As a consequence of the interactions given by (B3), the particles of the environment in Fig. B3 undergo a stronger modification of their initial positions and velocities than in Fig. B2. Roughly speaking, it is more "difficult" to convert the initial 1 into a final 0 in Fig. B3 than to keep the initial 0 as a final 0 in Fig. B2. The perturbation of the bottom particles of the environment around the positions $r_x = 8$ nm and $r_x = 9$ nm is remarkable. The same happens to the top particles of the environment around position $r_x = 12$ nm. In any case, we see that the environment without reset is able to repeat the operation 1 → 0 correctly more than 30 times.
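A rough estimate indicates why this choice of parameters keeps the environment almost still. Assuming only that the force between the system particle and one environment particle has the same magnitude $F$ on both (Newton's third law), the ratio of their accelerations is

$$ \frac{|a_i|}{|a_{29}|} = \frac{F/m_i}{F/m_{29}} = \frac{m_{29}}{m_i} = \frac{0.2\,m}{4000\,m} = 5 \cdot 10^{-5} $$

so each environment particle accelerates about $2 \cdot 10^{4}$ times less than the system particle. If the interaction is Coulomb-like, the small charge acts in the same direction: the system-environment force scales as $q_{29}\, q_i = 0.4\,q^2$, while the force between two environment particles scales as $q_i^2 = 1.6 \cdot 10^{-3}\,q^2$.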
The final positions and velocities of each particle of the environment after finalizing one operation are the initial conditions of that particle in the next operation. At the beginning of each operation, the system particle has a velocity $v_{x,29} = 2 \cdot 10^{6}$ m/s and $v_{y,29} = 0$. Its initial positions are $r_x = 0$ nm and $r_y = 3$ nm for defining a 0 and $r_x = 0$ nm and $r_y = 7$ nm for defining a 1. The operation 1 → 0 provokes a greater detectable perturbation of the environment momentum than the operation 0 → 0.
In Fig. B4, we plot the initial and final total momentum of all ($N = 29$) particles in the $y$ and $x$ directions for each of the 37 repetitions, for the 1 → 0 and 0 → 0 operations. The first time that an operation takes place, the total momentum in the $y$ direction is zero because none of the particles has a velocity in the $y$ direction. The total momentum in the $x$ direction coincides with the initial momentum of the system particle (all particles of the environment have zero initial velocity in the $x$ direction). Of course, the final momentum after the operation coincides with the initial one because of the conservation of the total momentum in a closed system. Because of the interactions between particles dictated by (B3), different particles of the environment acquire different velocities during the operation. Thus, in the second operation, the system has again the initial velocity $v_{x,29} = 2 \cdot 10^{6}$ m/s, while the environment particles start the second operation with the final conditions of the first operation. A new redistribution of the total momentum happens during the second operation. All subsequent operations behave similarly.
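For orientation, the total momentum of the first operation follows directly from the values quoted above, taking $m \approx 9.109 \cdot 10^{-31}$ kg for the free electron mass:

$$ P_x(0) = m_{29}\, v_{x,29}(0) = 0.2 \times 9.109 \cdot 10^{-31}\ \mathrm{kg} \times 2 \cdot 10^{6}\ \mathrm{m/s} \approx 3.64 \cdot 10^{-25}\ \mathrm{kg\,m/s}, \qquad P_y(0) = 0 $$

Conservation of the total momentum then means that, within one operation, whatever momentum the environment gains is lost by the system particle.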
We clearly see in Fig. B4 that the redistribution of momentum in the 1 → 0 operations is different from that in the 0 → 0 operations. In fact, by just looking at the evolution of the momentum of the environment, without discussing the system dynamics, one can identify which logical operation has occurred in the gate. The environment is slightly modified during each operation, but the modification of the environment during the operation 0 → 0 is different from the modification of the environment during the operation 1 → 0. In particular, $p_x$ is negative in 1 → 0 (green triangle and black circle in Fig. B4), while $p_x$ is positive in 0 → 0 (blue circle and red triangle in Fig. B4).
As said in Sec. 3.1, these differences between the environments seen in Fig. B4 are exactly what we meant by condition C1 when describing "environments with different final macroscopic properties". Of course, such a definition can seem a bit ambiguous. Another way of saying the same is that the phase space of the points belonging to the final state of the gate defined by (B3), when dealing with operations 1 → 0 and 0 → 0, is much more similar to the phase space of Fig. 3 than to the phase space of Fig. 4 in the main part of the paper. The latter requires a type of chaotic or thermalized behavior of the environment (which can be typical of thermal reservoirs), which is different from the well-defined behavior of the environment particles of the toy model that we have worked out in this appendix. Again, we conclude with one of the main messages of this paper: there is no need to assume only chaotic (thermal) environments, since the type of environment depicted in the phase space of Fig. 3 is also physically plausible for a logical erasure gate, as seen in Fig. B4.

Supplementary information. 'Not applicable'
macroscopic properties C1, C2 and C3 seems to be adapted to the anthropomorphic perceptions does not mean that those conditions are subjective or depending on human observations. Macroscopic conditions on physical systems are objective (physical) conditions which have to be satisfied by the microscopic evolutions. One can define these macroscopic properties as a result of a large-scale resolution of the apparatus which fixes the initial microscopic state or detects the final one. The macroscopic properties can be redefined as a particular distribution of phase-space points $X_C(t)$, $Y_C(t)$. In turn, specifying such a distribution of phase-space points is exactly equivalent to specifying the Hamiltonian $H_C(x, y)$. In other words, the Hamiltonian $H_{C1}(x, y)$ satisfying C1 is different from the Hamiltonian $H_{C2}(x, y)$ and both are different from $H_{C3}(x, y)$. The three Hamiltonians, by construction, can be designed to satisfy the same logical input/output table, but they are physically different in the way they manipulate the environment degrees of freedom during the system plus environment interaction. Thus, each of the Hamiltonians can have a different dissipation, even if they provide the same logical table. No human perception is involved in the discussion at all.
[53] H. Goldstein, Classical mechanics (Addison-Wesley, 2002)
[54] The conditions C1, C2 and C3 are basically conditions on what types of natural macroscopic properties can be expected for the final (not initial) environments. We consider that initial environments do not have any correlation/entanglement with the initial (1 or 0) logical property of the system and that they satisfy $E_{1\to 0}(t_i) = E_{0\to 0}(t_i)$ for most relevant macroscopic properties. Of course, one can envision the possibility of more exotic initial environments so that exotic relations between initial and final entropies can be expected. Such engineering of the initial environment is far from the scope of this work and the spirit of the Landauer's erasure principle.
[55] When dealing with a thermal environment, it is routinely assumed that the entropy variation of the environment is given by $\Delta S_{\mathrm{env}} = \Delta Q/T$, so that only the entropy variation of the system needs to be evaluated. This is the procedure in many classical and quantum developments [21]. By contrast, since we are considering general (not only thermal) environments, we are discussing the variation of the Boltzmann entropy for the whole system plus environment. Our procedure for the direct evaluation of the entropy of the whole system has the additional advantage of not having to assume that the whole entropy is equal to the sum of the entropy of its parts, which is not obvious when the parts have strong correlations between them.
[56] Notice that Landauer and Bennett were right in their argumentation that an erasure gate designed to work only once is not a valid gate in the present discussion of dissipation. Our disagreement here is with the implicit assumption in Landauer's and Bennett's argumentation that condition C1 could only be satisfied by erasure gates that work only once without reset. This last assumption is wrong because we can imagine gates that satisfy condition C1 while working many times (without reset), as long as the initial and final environments are quite similar (but not identical). A simple toy model showing the physical soundness of our proposal can be found in appendix B.
[57] The fact that there is no entropy limit for reversible logic with condition C2 is a well-known result that has even been tested experimentally [37].
[58] Notice that we can engineer systems with asymmetric phase space volumes for the initial 1 and 0 (or the associated environment phase spaces) so that, in one of the operations, the entropy change can be lower than the value predicted by Landauer as explained in Ref. [12]. In any case, these exotic results (validated experimentally in Ref. [41]) do not contradict the spirit of the original Landauer's principle. One could also envision exotic entropy relations by considering final 1 and 0 whose macroscopic system properties are different from the macroscopic system properties assigned to the initial 1 and 0.
[70] Let us clarify whether the degrees of freedom $y$ include all the positions of the rest of the Universe or not. In principle, it seems that we would have to consider $y$ as the rest of the Universe, but it is not needed. The mentioned experiments [64][65][66][67][68][69] on closed quantum systems do not include the whole Universe because the time scale dictating the equilibration between the system and the nearby degrees of freedom of the environment can be much shorter than the time scales introduced by the coupling of the system to the rest of the Universe. Thus, we have the right to discuss our system plus nearby degrees of freedom as a closed quantum system, as long as we are not looking for its behavior at very long times [63]. The whole topic of thermalization of (well-approximated) closed quantum systems [64][65][66][67][68][69] is based on this reasonable separation between a nearby environment and the rest of the Universe.
[71] In our discussion of an erasure gate with minimum dissipation, it makes no sense to consider initial states $\Psi_1(x, y, 0)$ and $\Psi_0(x, y, 0)$ with different expectation values of the total energy, because the energy is a macroscopic constant of motion that would then be different for the operations 1 → 0 and 0 → 0. Obviously, in an erasure process, we do not want any macroscopic property that allows us to differentiate the final states [47]. The expectation value of the energy will be $E_{\rho_1}(t) = \sum_n |c_{n,1}|^2 E_n = E_{\rho_0}(t) = \sum_n |c_{n,0}|^2 E_n$, which does not involve the off-diagonal elements of the density matrix. Then, if we consider a large number of eigenstates with similar (but not exactly identical) energies, it seems reasonable to assume that the condition $E_{\rho_1}(t) = E_{\rho_0}(t)$ implies that the diagonal elements of both density matrices are similar, $\rho_{1,n,n} \approx \rho_{0,n,n}$ for all $n$. The differences between $\Psi_1(x, y, 0)$ and $\Psi_0(x, y, 0)$ are kept in the differences between the phases of $c_{n,1}$ and $c_{n,0}$.
[72] C.F. Destefani, X. Oriols, Assessing quantum thermalization in physical and configuration spaces via many-body weak values. Physical Review A 107(1), 012213 (2023)
the logical (not physical) gate. This exact point was mentioned by Vaccaro and Barnett [59,60] when they emphasized that the costs of erasure depend on the nature of the gate and on the reservoir (environment) with which it is coupled. Let us discuss an example of how the dissipation has to be linked to the type of physical Hamiltonian used to design the logical gate. One can easily imagine a horrible erasure gate that dissipates a lot of heat (whatever the Shannon entropy says). Examples of such horrible erasure gates are any of the green points plotted in Fig. 2, which refer to present-day computers whose dissipation is at least eight orders of magnitude larger than the Landauer limit. Certainly, for such horrible gates, the Shannon entropy cannot be identical to the Boltzmann entropy. Having said this, one could still argue that the Shannon entropy is related to the Boltzmann entropy under the assumption that one only considers physical gates that take the minimum value of entropy change allowed by the intermediate Landauer's principle (condition C2). Such an argument for linking Shannon and Boltzmann entropy would be correct, but it would seem as useless as saying that the Shannon entropy is equal to the Boltzmann entropy whenever such a relation is true. Our overall argumentation on this point coincides with the title of the work of Kish and Ferry [31]: "Information entropy and thermal entropy: apples and oranges".