Abstract
Key to biological success, the requisite variety that confronts an adaptive organism is the set of detectable, accessible, and controllable states in its environment. We analyze its role in the thermodynamic functioning of information ratchets—a form of autonomous Maxwellian Demon capable of exploiting fluctuations in an external information reservoir to harvest useful work from a thermal bath. This establishes a quantitative paradigm for understanding how adaptive agents leverage structured thermal environments for their own thermodynamic benefit. General ratchets behave as memoryful communication channels, interacting with their environment sequentially and storing results to an output. The bulk of thermal ratchets analyzed to date, however, assume memoryless environments that generate input signals without temporal correlations. Employing computational mechanics and a new information-processing Second Law of Thermodynamics (IPSL) we remove these restrictions, analyzing general finite-state ratchets interacting with structured environments that generate correlated input signals. On the one hand, we demonstrate that a ratchet need not have memory to exploit an uncorrelated environment. On the other, and more appropriate to biological adaptation, we show that a ratchet must have memory to most effectively leverage structure and correlation in its environment. The lesson is that to optimally harvest work a ratchet’s memory must reflect the input generator’s memory. Finally, we investigate achieving the IPSL bounds on the amount of work a ratchet can extract from its environment, discovering that finite-state, optimal ratchets are unable to reach these bounds. In contrast, we show that infinite-state ratchets can go well beyond these bounds by utilizing their own infinite “negentropy”. We conclude with an outline of the collective thermodynamics of information-ratchet swarms.
Notes
Probability distributions over infinitely many degrees of freedom ultimately require a measure-theoretic treatment. This is too heavy a burden in the current context. The difficulties can be bypassed by assuming that the number of bits in the infinite information reservoir is a large but finite, positive integer L. Then, instead of the infinite ranges in \(\text {B}_{0:\infty }\) and \(\text {B}_{0:\infty /N}\), we use \(\text {B}_{0:L}\) and \(\text {B}_{0:L/N}\), respectively, and take the appropriate limit when needed.
For a somewhat similar approach see Merhav [62].
References
Shannon, C.E., McCarthy, J. (eds.): Automata Studies. Annals of Mathematical Studies, vol. 34. Princeton University Press, Princeton (1956)
Wiener, N.: Extrapolation, Interpolation, and Smoothing of Stationary Time Series. Wiley, New York (1949)
Wiener, N.: Nonlinear prediction and dynamics. In: Wiener, N. (ed.) Collected Works III. MIT Press, Cambridge (1981)
Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423, 623–656 (1948)
Shannon, C.E.: Communication theory of secrecy systems. Bell Syst. Tech. J. 28, 656–715 (1949)
Shannon, C.E.: Coding theorems for a discrete source with a fidelity criterion. IRE Natl. Convent. Rec. 7, 142–163 (1959)
Shannon, C.E.: Two-way communication channels. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 611–644. University of California Press, Berkeley (1961)
Wiener, N.: Cybernetics: Or Control and Communication in the Animal and the Machine. MIT, Cambridge (1948)
Wiener, N.: The Human Use of Human Beings: Cybernetics and Society. Da Capo Press, Cambridge (1988)
von Bertalanffy, L.: General System Theory: Foundations, Development, Applications, revised edn. Penguin University Books, New York (1969)
Ashby, W.R.: Design for a Brain: The Origin of Adaptive Behavior, 2nd edn. Chapman and Hall, New York (1960)
Quastler, H.: The status of information theory in biology—a roundtable discussion. In: Yockey, H.P. (ed.) Symposium on Information Theory in Biology, p. 399. Pergamon Press, New York (1958)
Conway, F., Siegelman, J.: Dark Hero of the Information Age: In Search of Norbert Wiener, the Father of Cybernetics. Basic Books, New York (2006)
Crutchfield, J.P.: Between order and chaos. Nat. Phys. 8(January), 17–24 (2012)
Klages, R., Just, W., Jarzynski, C. (eds.): Nonequilibrium Statistical Physics of Small Systems: Fluctuation Relations and Beyond. Wiley, New York (2013)
Ashby, W.R.: An Introduction to Cybernetics, 2nd edn. Wiley, New York (1960)
Szilard, L.: On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings. Z. Phys. 53, 840–856 (1929)
Leff, H., Rex, A.: Maxwell’s Demon 2: Entropy, Classical and Quantum Information, Computing. Taylor and Francis, New York (2002)
Kolmogorov, A.N.: Three approaches to the concept of the amount of information. Prob. Inf. Trans. 1, 1 (1965)
Kolmogorov, A.N.: Combinatorial foundations of information theory and the calculus of probabilities. Russ. Math. Surv. 38, 29–40 (1983)
Chaitin, G.: On the length of programs for computing finite binary sequences. J. ACM 13, 145 (1966)
Vitanyi, P.M.B.: Introduction to Kolmogorov Complexity and Its Applications. ACM Press, Reading (1990)
Jaynes, E.T.: Information theory and statistical mechanics. Phys. Rev. 106, 620–630 (1957)
Horowitz, J.M., Vaikuntanathan, S.: Nonequilibrium detailed fluctuation theorem for repeated discrete feedback. Phys. Rev. E 82, 061120 (2010)
Boyd, A.B., Crutchfield, J.P.: Demon dynamics: deterministic chaos, the Szilard map, and the intelligence of thermodynamic systems. Phys. Rev. Lett. 116, 190601 (2016)
Mandal, D., Jarzynski, C.: Work and information processing in a solvable model of Maxwell’s demon. Proc. Natl. Acad. Sci. USA 109(29), 11641–11645 (2012)
Boyd, A.B., Mandal, D., Crutchfield, J.P.: Identifying functional thermodynamics in autonomous Maxwellian ratchets. New J. Phys. 18, 023049 (2016)
Boyd, A.B., Mandal, D., Crutchfield, J.P.: Correlation-powered information engines and the thermodynamics of self-correction. Phys. Rev. E 95(1), 012152 (2017)
Landauer, R.: Irreversibility and heat generation in the computing process. IBM J. Res. Dev. 5(3), 183–191 (1961)
Bennett, C.H.: Thermodynamics of computation—a review. Int. J. Theor. Phys. 21, 905 (1982)
Deffner, S., Jarzynski, C.: Information processing and the second law of thermodynamics: an inclusive, Hamiltonian approach. Phys. Rev. X 3, 041003 (2013)
Barato, A.C., Seifert, U.: Stochastic thermodynamics with information reservoirs. Phys. Rev. E 90, 042150 (2014)
Brillouin, L.: Maxwell’s demon cannot operate: information and entropy I. J. Appl. Phys. 22, 334–337 (1951)
Bennett, C.H.: Demons, engines and the Second Law. Sci. Am. 257(5), 108–116 (1987)
Lu, Z., Mandal, D., Jarzynski, C.: Engineering Maxwell’s demon. Phys. Today 67(8), 60–61 (2014)
Barnett, N., Crutchfield, J.P.: Computational mechanics of input-output processes: structured transformations and the \(\epsilon \)-transducer. J. Stat. Phys. 161(2), 404–451 (2015)
Sagawa, T., Ueda, M.: Generalized Jarzynski equality under nonequilibrium feedback control. Phys. Rev. Lett. 104, 090602 (2010)
Merhav, N.: Sequence complexity and work extraction. J. Stat. Mech. P06037 (2015)
Mandal, D., Quan, H.T., Jarzynski, C.: Maxwell’s refrigerator: an exactly solvable model. Phys. Rev. Lett. 111, 030602 (2013)
Barato, A.C., Seifert, U.: An autonomous and reversible Maxwell’s demon. Europhys. Lett. 101, 60001 (2013)
Barato, A.C., Seifert, U.: Unifying three perspectives on information processing in stochastic thermodynamics. Phys. Rev. Lett. 112, 090601 (2014)
Kolmogorov, A.N.: Entropy per unit time as a metric invariant of automorphisms. Dokl. Akad. Nauk SSSR 124, 754 (1959) (in Russian); Math. Rev. 21, 2035b
Sinai, Ja.G.: On the notion of entropy of a dynamical system. Dokl. Akad. Nauk SSSR 124, 768 (1959)
Crutchfield, J.P., Feldman, D.P.: Regularities unseen, randomness observed: levels of entropy convergence. CHAOS 13(1), 25–54 (2003)
Brookshear, J.G.: Theory of Computation: Formal Languages, Automata, and Complexity. Benjamin/Cummings, Redwood City (1989)
Rabiner, L.R., Juang, B.H.: An introduction to hidden Markov models. IEEE ASSP Mag. 3, 4–16 (1986)
Rabiner, L.R.: A tutorial on hidden Markov models and selected applications. IEEE Proc. 77, 257 (1989)
Elliot, R.J., Aggoun, L., Moore, J.B.: Hidden Markov Models: Estimation and Control. Applications of Mathematics, vol. 29. Springer, New York (1995)
Ephraim, Y., Merhav, N.: Hidden Markov processes. IEEE Trans. Inf. Theory 48(6), 1518–1569 (2002)
Shalizi, C.R., Crutchfield, J.P.: Computational mechanics: pattern and prediction, structure and simplicity. J. Stat. Phys. 104, 817–879 (2001)
Crutchfield, J.P.: The calculi of emergence: computation, dynamics, and induction. Physica D 75, 11–54 (1994)
Cover, T.M., Thomas, J.A.: Elements of Information Theory, 2nd edn. Wiley-Interscience, New York (2006)
del Campo, A., Goold, J., Paternostro, M.: More bang for your buck: super-adiabatic quantum engines. Sci. Rep. 4, 6208 (2014)
Minsky, M.: Computation: Finite and Infinite Machines. Prentice-Hall, Englewood Cliffs (1967)
Lewis, H.R., Papadimitriou, C.H.: Elements of the Theory of Computation, 2nd edn. Prentice-Hall, Englewood Cliffs (1998)
Kemeny, J.G., Snell, J.L.: Denumerable Markov Chains, 2nd edn. Springer, New York (1976)
Ehrenberg, M., Blomberg, C.: Thermodynamic constraints on kinetic proofreading in biosynthetic pathways. Biophys. J. 31, 333–358 (1980)
Chapman, A., Miyake, A.: How can an autonomous quantum Maxwell demon harness correlated information? arXiv:1506.09207
Cao, Y., Gong, Z., Quan, H.T.: Thermodynamics of information processing based on enzyme kinetics: an exactly solvable model of an information pump. Phys. Rev. E 91, 062117 (2015)
Phillips, R., Kondev, J., Theriot, J., Orme, N.: Physical Biology of the Cell. Garland Science, New York (2008)
Gomez-Marin, A., Parrondo, J.M.R., Van den Broeck, C.: Lower bounds on dissipation upon coarse graining. Phys. Rev. E 78, 011107 (2008)
Merhav, N.: Relations between work and entropy production for general information-driven, finite-state engines. J. Stat. Mech. Theor. Exp. 2017, 023207 (2017)
Acknowledgements
As an External Faculty member, JPC thanks the Santa Fe Institute for its hospitality during visits. This work was supported in part by FQXi Grant Number FQXi-RFP-1609 and the U. S. Army Research Laboratory and the U. S. Army Research Office under contracts W911NF-13-1-0390 and W911NF-12-1-0234.
Appendices
Appendix 1: Optimally Leveraging Memoryless Inputs
It is intuitively appealing to think that memoryless inputs are best utilized by memoryless ratchets; in other words, that the optimal ratchet for a memoryless input is itself memoryless. We now prove this intuition correct, starting with the expression for work production per time step:
with \(\beta =1/k_B T\). The benefit of the decomposition in the second line will become clear shortly. Let us first introduce several quantities that will prove useful:
For a memoryless input process, sequential inputs are statistically independent. This implies \(Y_N\) and \(X_N\) are independent, so the stationary distribution \(\pi _{x \otimes y}\) can be written as a product of marginals:
In terms of the above quantities, we can rewrite work for a memoryless input process as:
where \(D_{KL}(p \Vert p_R)\) is the relative entropy of the distribution p with respect to \(p_R\) [52]. Note that the last term in the expression vanishes, since the ratchet state distribution is the same before and after an interaction interval:
and so:
Thus, we find the average work production to be:
Let us now use the fact that coarse graining any two distributions, say p and q, cannot increase the relative entropy between them [52, 61]. In the work formula, \(p^Y\) is a coarse graining of p and \(p^Y_R\) is a coarse graining of \(p_R\), implying:
Combining the above relations, we find the inequality:
Now, the marginal transition probability \(p^Y(y,y')\) can be broken into the product of the stationary distribution over the input variable \(\pi ^Y_{y}\) and a Markov transition matrix \(M^Y_{y \rightarrow y'}\) over the input alphabet:
which for any ratchet M is:
We can treat the Markov matrix \(M^Y\) as corresponding to a ratchet in the same way as M. Note that \(M^Y\) is effectively a memoryless ratchet since we do not need to refer to the internal states of the corresponding ratchet. See Fig. 2. The resulting work production for this ratchet \(\langle W^Y \rangle \) can be expressed as:
Thus, for any memoryful ratchet driven by a memoryless input we can design a memoryless ratchet that extracts at least as much work as the memoryful ratchet.
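The coarse-graining step underlying this argument is easy to check numerically. The following minimal Python sketch (the distributions and state counts are hypothetical, chosen only for illustration) verifies that marginalizing out the ratchet state cannot increase the relative entropy between two joint distributions:

```python
import numpy as np

def kl(p, q):
    """Relative entropy D_KL(p || q) in bits."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

rng = np.random.default_rng(0)

# Hypothetical joint distributions over (ratchet state x, symbol y):
# three ratchet states, two symbols.
p = rng.random((3, 2)); p /= p.sum()
q = rng.random((3, 2)); q /= q.sum()

# Coarse-grain by marginalizing out the ratchet state x.
p_y, q_y = p.sum(axis=0), q.sum(axis=0)

# Log-sum inequality: coarse graining cannot increase relative entropy.
assert kl(p_y, q_y) <= kl(p, q) + 1e-12
```

The inequality is an instance of the log-sum inequality; it holds for any pair of distributions and any coarse graining, not just the random example above.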
There is, however, a caveat. Strictly speaking, we must assume binary input. This is due to the requirement that the matrix M be detailed balanced (see Sect. 2), so that the expression for work used here is appropriate. More technically, we do not yet have a proof that if M is detailed balanced then so is \(M^Y\), a critical requirement above; in fact, there are examples where \(M^Y\) does not exhibit detailed balance. We do know, however, that \(M^Y\) is guaranteed to be detailed balanced if \(\mathcal {Y}\) is binary, since then \(M^Y\) has only two states and all probability flows between them must balance. Thus, for memoryless binary input processes, we have established that there is little point in using finite memoryful ratchets to extract work: memoryless ratchets extract work optimally from memoryless binary inputs.
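The two-state claim can be illustrated directly. In a two-state chain there is a single probability flow, so stationarity forces \(\pi _0 M_{0\rightarrow 1} = \pi _1 M_{1\rightarrow 0}\), which is exactly detailed balance. A minimal numerical sketch (with randomly generated, hypothetical transition matrices):

```python
import numpy as np

rng = np.random.default_rng(2)

# Any irreducible two-state Markov matrix obeys detailed balance with
# respect to its stationary distribution: there is only one probability
# flow, pi_0 M_01 = pi_1 M_10, and in steady state it must balance.
for _ in range(100):
    M = rng.random((2, 2))
    M /= M.sum(axis=1, keepdims=True)          # rows are distributions
    pi = np.array([M[1, 0], M[0, 1]])
    pi /= pi.sum()                             # stationary distribution
    assert np.allclose(pi @ M, pi)                          # stationarity
    assert np.isclose(pi[0] * M[0, 1], pi[1] * M[1, 0])     # detailed balance
```

With three or more symbol states, cyclic probability flows become possible and detailed balance can fail, which is why the argument does not extend beyond binary inputs.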
Appendix 2: An IPSL for Information Engines
Reference [27] proposed a generalization of the Second Law of Thermodynamics to information-processing systems (IPSL, Eq. 1) under the premise that the Second Law applies even when the thermodynamic entropy of the information-bearing degrees of freedom is taken to be their Shannon information entropy. This led to a consistent prediction of the thermodynamics of information engines and was validated through numerical calculations. This appendix proves the assertion for the class of information engines considered here. The key idea is to use the irreversibility of the Markov chain dynamics followed by the engine and by the information-bearing degrees of freedom to derive the IPSL inequality.
For the sake of presentation, we introduce new notation here. We refer to the engine as the demon \(\text {D}\), following the original motivation for information engines, and to the information-bearing two-state systems as the bits \(\text {B}\). In our setup, \(\text {D}\) interacts with an infinite sequence of bits \(\text {B}_0 \text {B}_1 \text {B}_2 \ldots \), as shown in Fig. 10. The figure also explains the connection of the current terminology to that in the main text. In particular, we show two snapshots of the setup, at times \(t = N\) and \(t = N + 1\). During that interval \(\text {D}\) interacts with bit \(\text {B}_N\) and changes it from (input) symbol \(Y_N\) to (output) symbol \(Y'_N\). The corresponding dynamics is governed by the Markov transition matrix \(M_{\text {D}\otimes \text {B}_N}\), which acts only on the joint subspace of \(\text {D}\) and \(\text {B}_N\).
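To make the subspace action concrete, here is a minimal Python sketch (the dimensions, transition matrix, and initial distribution are all hypothetical) showing that applying a matrix to the joint (demon, \(\text {B}_N\)) indices leaves the marginal distribution of the other bits untouched:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical setup: a 2-state demon D and three binary bits B_0 B_1 B_2.
# M_DB is a stochastic matrix on the 4-dimensional joint (D, B_N) space.
M_DB = rng.random((4, 4)); M_DB /= M_DB.sum(axis=1, keepdims=True)

def step(P, N):
    """One interaction interval: apply M_DB to (D, B_N), identity elsewhere."""
    # P is indexed as P[d, b0, b1, b2]; move B_N next to D, apply, move back.
    P = np.moveaxis(P, 1 + N, 1)
    shape = P.shape
    P = P.reshape(4, -1)           # fuse (D, B_N) into one joint index
    P = M_DB.T @ P                 # push the distribution forward one step
    P = P.reshape(shape)
    return np.moveaxis(P, 1, 1 + N)

P = rng.random((2, 2, 2, 2)); P /= P.sum()
P2 = step(P, N=1)

assert np.isclose(P2.sum(), 1.0)
# Marginal over the noninteracting bits (B_0, B_2) is unchanged.
assert np.allclose(P2.sum(axis=(0, 2)), P.sum(axis=(0, 2)))
```

The preserved marginal is exactly the property invoked below when the Shannon entropy of the noninteracting bits is held fixed over an interaction interval.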
Under Markov dynamics the relative entropy of the current distribution with respect to the asymptotic steady-state distribution is a monotonically decreasing function of time. We now use this property for the transition matrix \(M_{\text {D}\otimes \text {B}_N}\) to derive the IPSL. Denote the distribution of \(\text {D}\)’s states and the bits \(\text {B}\) at time t by \(P_{\text {D}\text {B}_{0:\infty }}(t)\). Here, \(\text {B}_{0:\infty }\) stands for all the information-bearing degrees of freedom (see Footnote 1). The steady-state distribution corresponding to the operation of \(M_{\text {D}\otimes \text {B}_N}\) is determined via:
where \(\pi _{\text {D}\text {B}_N}^\text {eq}\) denotes the steady-state distribution:
and \(P_{\text {B}_{0:\infty /N}} (N)\) the marginal distribution of all the bits other than the N-th bit at time \(t = N\). We introduce \(\pi ^\text {s}(N)\) in Eq. (21) for brevity.
The rationale behind the righthand side of Eq. (20) is that the matrix \(M_{\text {D}\otimes \text {B}_N}\) acts only on \(\text {D}\) and \(\text {B}_N\), sending their joint distribution to the stationary distribution \(\pi _{\text {D}\text {B}_N}^\text {eq}\) (on repeated operation) while leaving intact the marginal distribution of the rest of \(\text {B}\). The superscript \(\text {eq}\) emphasizes that the distribution \(\pi _{\text {D}\text {B}_N}^\text {eq}\) is an equilibrium distribution, as opposed to a nonequilibrium steady-state distribution, due to the assumed detailed-balance condition on \(M_{\text {D}\otimes \text {B}_N}\). In other words, \(\pi _{\text {D}\text {B}_N}^\text {eq}\) follows the Boltzmann distribution:
$$\pi _{\text {D}\text {B}_N}^\text {eq}(x,y) = e^{\beta \left( F_{\text {DB}_N} - E_{\text {DB}_N}(x,y)\right) },$$
for inverse temperature \(\beta \), free energy \(F_{\text {DB}_N}\), and energy \(E_{\text {DB}_N}(x,y)\). In the current notation we express the monotonicity of relative entropy as:
$$D\big( P_{\text {D}\text {B}_{0:\infty }}(N) \,\big\Vert \, \pi ^\text {s}(N) \big) \ge D\big( P_{\text {D}\text {B}_{0:\infty }}(N+1) \,\big\Vert \, \pi ^\text {s}(N) \big),$$
where \(D(p \Vert q)\) denotes the relative entropy of the distribution p with respect to q:
$$D(p \Vert q) = \sum _i p_i \log _2 \frac{p_i}{q_i},$$
with the sum running over the states i. The IPSL is obtained as a consequence of inequality Eq. (23), as we now show (see Footnote 2).
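The monotonicity property invoked here is straightforward to confirm numerically. The sketch below (with hypothetical energies, \(\beta = 1\), and a four-state joint space) builds a detailed-balanced Metropolis-style transition matrix from a Boltzmann distribution and checks that the relative entropy to \(\pi ^\text {eq}\) never increases under the dynamics:

```python
import numpy as np

rng = np.random.default_rng(3)

# A detailed-balanced (Metropolis) transition matrix M on a hypothetical
# four-state joint (demon, bit) space, with beta = 1 and random energies.
E = rng.random(4)
pi_eq = np.exp(-E); pi_eq /= pi_eq.sum()
M = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        if i != j:
            M[i, j] = min(1.0, pi_eq[j] / pi_eq[i]) / 4
    M[i, i] = 1.0 - M[i].sum()

def kl(p, q):
    """Relative entropy D(p || q) in bits."""
    return float(np.sum(p * np.log2(p / q)))

# Relative entropy to the stationary (here, equilibrium) distribution
# is non-increasing under repeated application of M.
p = rng.random(4); p /= p.sum()
ds = [kl(p, pi_eq)]
for _ in range(10):
    p = p @ M
    ds.append(kl(p, pi_eq))
assert all(b <= a + 1e-12 for a, b in zip(ds, ds[1:]))
```

The monotone decrease is a general consequence of the data-processing inequality for relative entropy; the Metropolis construction is used here only as a convenient way to obtain a detailed-balanced matrix.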
First, we rewrite the lefthand side of Eq. (23) as:
The first line applies the definition of relative entropy. Here, \({\text {H}}_\text {X}\) denotes the Shannon entropy of random variable X in information units of bits (base 2). The second line employs the expression of \(\pi ^\text {s}(N)\) given in Eq. (21). The final line uses the Boltzmann form of \(\pi ^\text {eq}_{\text {DB}_N}\) given in Eq. (22). Here, \(\langle E_{\text {D}\text {B}_N}\rangle (N)\) denotes the average energy of \(\text {D}\) and the interacting bit \(\text {B}_N\) at time \(t = N\).
Second, in a similar way, we have the following expression for the righthand side of Eq. (23):
Note that the marginal distribution of the noninteracting bits \(\text {B}_{0:\infty /N}\) does not change over the interval \(t = N\) to \(t = N + 1\), since the matrix \(M_{\text {D}\otimes \text {B}_N}\) acts only on \(\text {D}\) and \(\text {B}_N\); hence the Shannon entropy of the noninteracting bits also remains unchanged over the interval.
Third, combining Eqs. (23), (24), and (25), we get the inequality:
$$\Delta {\text {H}}_{\text {DB}_{0:\infty }} \ge \beta \, \Delta \langle E_{\text {DB}_N} \rangle \log _2 e,$$
where \(\Delta {\text {H}}_{\text {DB}_{0:\infty }}\) is the change in the joint Shannon entropy of \(\text {D}\) and \(\text {B}\), and \(\Delta \langle E_{\text {DB}_N} \rangle \) is the change in the average energy of \(\text {D}\) and the interacting bit \(\text {B}_N\), over the interaction interval.
Fourth, according to the ratchet’s design, \(\text {D}\) and \(\text {B}\) are decoupled from the work reservoir during the interaction intervals. (The work reservoir is connected only at the end points of intervals, when one bit is replaced by another.) From the First Law of Thermodynamics, the increase in energy \(\Delta \langle E_{\text {DB}_N} \rangle \) comes from the heat reservoir. In other words, we have the relation:
$$\Delta \langle E_{\text {DB}_N} \rangle = \Delta Q,$$
where \(\Delta Q\) is the heat given to the system. (In fact, Eq. (27) is valid for each realization of the dynamics, not just on the average, since the conservation of energy holds in each realization.)
Finally, combining Eqs. (26) and (27), we get:
$$\Delta {\text {H}}_{\text {DB}_{0:\infty }} \ge \beta \, \Delta Q \log _2 e,$$
which is the basis of the IPSL as demonstrated in Ref. [27]; see, in particular, Eq. (A7) there.
Cite this article
Boyd, A.B., Mandal, D. & Crutchfield, J.P. Leveraging Environmental Correlations: The Thermodynamics of Requisite Variety. J Stat Phys 167, 1555–1585 (2017). https://doi.org/10.1007/s10955-017-1776-0