Record dynamics of evolving metastable systems: theory and applications

Record Dynamics (RD) deals with complex systems evolving through a sequence of metastable stages. These are macroscopically distinguishable and appear stationary, except for the sudden and rapid changes, called quakes, which induce the transitions from one stage to the next. This phenomenology is well known in physics as"physical aging", but from the vantage point of RD the evolution of a class of systems of physical, biological and cultural origin is rooted in a hierarchically structured configuration space and can therefore be analyzed by similar statistical tools. This colloquium paper strives to present in a coherent fashion methods and ideas that have gradually evolved over time. To this end, it first describes the differences and similarities between RD and two widespread paradigms of complex dynamics, Self Organized Criticality and Continuous Time Random Walks. It then outlines the Poissonian nature of records events in white noise time series, and connects it to the statistics of quakes in metastable hierarchical systems, arguing that the relaxation effects of quakes can generally be described by power laws unrelated to criticality. Several different applications of RD have been developed over the years. Some of these are described, showinghe basic RD hypothesis, the log time homogeneity of quake dynamics, can be empirically verified in a given context. The discussion summarizes the paper and briefly mentions applications not discussed in detail. Finally, the outlook points to possible improvements and to new areas of research where RG could be of use.


INTRODUCTION
Simply stated, the founding axiom of equilibrium statistical physics is 'all micro-states are equally probable in a thermally isolated system'. Later coarse-graining criteria are of less sweeping generality. Mori-Zwanzig projections, for example, aim to eliminate fast variables from a set of interacting degrees of freedom (DoF), while diffusion offers a probabilistic description of the configuration space movement of particles subject to external white noise. In the same configuration space, but describing a non-equilibrium process, Continuous Time Random Walks [1,2] (CTRW) jump through a sequence of 'traps', each representing an attractor of the system's noiseless dynamics. Importantly, choosing the Probability Density Function (PDF) of the jump times to be a powerlaw allows one to describe e.g. sub-diffusive behavior. While power laws are telltale signs of complex structure, present, e.g., at the critical temperature of systems undergoing a second order phase transition, their microscopic origin is not a focus point of CTRW. Instead, they are assumed apriori, in form of the PDF. Self Organized Criticality [3,4] posits that a seemingly similar behavior is shared by certain systems which spontaneously organize into a critical-like state, with no tunable external parameters but while being slowly driven. Finally, diffusion in hierarchically organized spaces [5][6][7] generates power-law relaxation laws with no reference to equilibrium properties, e.g., critical points. In summary, there is a wide range of non-equilibrium behaviors in systems with many interacting degrees of freedom, for which there are often competing statistical descriptions that all rely on simplifying assumptions and heuristic coarse-graining tools.
blueThe statistics of extreme events is a subject with a long tradition. and many applications. Gumbel [8] famously studied the distribution of the largest out of M independent identically distributed random variables (i.i.d.). This statistics of extremes finds its place in standard text books, such as [9] and [10], leading to ongoing studies into so-called "extreme-value statistics" and their universality classes [11,12]. In [13] the focus on the average and variance of the number of records in time series of i.i.d. deviates whose distribution changes in time. The properties of the record series have thus implications regarding the series from which the records are extracted. E.g. deviations from the expected, independent production of records in a time series can be used to analyze hidden correlations. This has been applied to sports and climate data, or evolution [14][15][16]. Recent progress in record statistics and its applications is discussed in [17]. red In Record Dynamics (RD), record fluctuations of a locally stationary signal, e.g. thermal noise, trigger macroscopic changes in metastable systems. The magnitude of these changes is linked to configuration space properties, e.g. the hierarchical barrier structure of the energy landscape, rather than the size of the records. We are therefore interested in temporal properties, such as the PDF of the waiting time for the next record in time series of i.i.d., e.g. energy fluctuations in thermalized systems.
The number of records in such time series is, to a good approximation, (log)Poisson distributed, see Eq. (8) anf Fig. 3.
RD treats metastable systems evolving through a sequence of 'punctuated equilibria', a term invented by S. J. Gould [18] in a macroevolutionary context. As we have argued extensively [19], the same term also describes situations lacking time-translational invariance and is used here in this more general sense. Sudden events, or 'punctuations', which are clearly distinguishable from reversible fluctuations, lead from one metastable (pseudoequilibrium) state to the next, and control all macroscopic changes. The waiting times for these de-facto irreversible events, dubbed 'quakes', all have expectation values which are finite but increase monotonically as the system 'ages' out of its initial state. The statistics of quakes provides a coarse-grained description of the multi-scale behavior characteristic of complex systems , including the presence of power laws for one and two point moments in the range of parameters specifying the system's glassy state. The strong dynamical similarities of very different metastable systems is highlighted here by treating each of them with the same RD techniques. The reader is referred to the literature for more detailed descriptions of the cases presented.
The rest of this paper is organized as follows: Section II relates RD to Self-Organized Criticality and to diffusion in hierarchical spaces. Section III summarizes mathematical assumptions and predictions and describes the techniques used to extract the quakes and their statistics from available data. Section IV reviews a number of applications to observational, experimental and simulational data, and Section V summarizes the presentation and mentions possible RD extensions.
Finally, the figure included come from works published over many years. The notation has evolved during this period, and we hope in the reader's understanding regarding the consequent non-uniform notation, mainly in the figures and their captions

II. BACKGROUND
The configuration space of a metastable system comprises many ergodic components differing in terms of macroscopic observables [20]. Reversible fluctuations occur within the components, while transitions from one component to another require crossing sizeable free energy barriers, which may differ for a transition and its reverse. If, within a certain observation time, typical equilibrium fluctuations can trigger a transition one way but not the other, this transition is de facto irreversible. Irreversible transitions violate detailed balance and their introduction would preclude an accurate description of near-equilibrium states. In contrast, de facto irreversible transitions become reversible over time. In a system with a continuum of barrier heights, increasingly many barriers are crossed reversibly and ergodic components grow in size as the system evolves out of its initial state. This situation typically occurs whenever the initial quench generates a marginally stable, i.e., a metastable state far from equilibrium stabilized by rather small barriers.
Self Organized Criticality (SOC) and hierarchical diffusion models [6,7,21] focus on different aspects of the dynamics out of marginally stable states. Specifically, SOC describes stationary fluctuations of driven systems, while RD is inherently non-stationary. Nevertheless, RD shares conceptual aspects with both SOC and hierarchical diffusion, but unlike them uses de-facto irreversibility to treat barrier crossings as a log-Poisson process. This provides an empirically verifiable coarse-graining tool leading to analytical predictions.
Below, we briefly discuss key modeling properties of SOC and diffusion in hierarchies with focus on their relation to RD. The statistics of records is dealt with in the next section. Importantly, experimental or numerical data usually do not directly deliver a series of record events. Instead anomalous events, the quakes, must be identified which allegedly correspond to transitions to new ergodic components. The final step is to verify that these quakes are described by record statistics, in short that they constitute a Poisson process whose average has a logarithmic time dependence. Several examples are given in Section IV. A.
SOC & RD Self Organized Criticality (SOC) is a coarse grained description of slowly driven, spatially extended systems with a large number of interacting degrees of freedom, e.g., famously, the sand pile model [3]. In a SOC envisaged sandpile, grains of sand are added, one at the time, eventually leading to a metastable situation when the local slope of the pile exceeds its angle of repose. At some later time, an avalanche occurs to restore the stability, or, rather, the marginal stability of the pile. SOC macroscopic dynamics is thus described by avalanches, which expectedly have a broad distribution of sizes and of inter-event times. The metastable states are highly susceptible, similar to critical states, in that a small perturbation, i.e., adding a grain of sand, can elicit a large response, like an avalanche.
This section will not dwell on the large literature generated by the SOC paradigm, see [22] and references therein, but will instead take one step back in time to highlight the origin of some key ideas that SOC and RD do share. Our starting point, to which we refer for more details, is a paper by Tang, Wiesenfeld, Bak, Coppersmith and Littlewood [23], for short TWBCL, predating the sand pile model [3], The TWBCL model considers a linear array of displacements y j of N identical masses, each elastically coupled to its nearest neighbors and subject to friction. The array moves in a sinusoidal potential, driven by a train of square wave pulses. In the limit of weak elastic couplings and high amplitude of the square wave pulses, the periodic solutions of the problem are characterized by the curvature C(n) of the array y(n) at each (integer) time n right before the n + 1 pulse begins. In a metastable state, C does not depend on n. As TWBCL point out, all metastable states lie in an N -dimensional hypercube, centered at the origin. A randomly chosen initial state will move toward the origin, but will stick to the surface of the hypercube, once it gets there. Importantly, initial configurations sufficiently far from the surface invariably end sticking at the corners of the hypercube. The states picked are minimally stable, and are akin to the metastable states of the sandpile. Since the dynamics is purely deterministic, stable states in the interior of the hypercube are never accessed. Adding white noise to the dynamics allows the system to relax logarithmically toward the center [21].
In terms of concepts laid out by TWBCL, the difference between SOC and RD is that the former considers jumps between corner states, induced by external driving but free of internal fluctuations, while the latter describes the slow aging process which follows from random fluctuations by coupling to a heat-bath or some other form of external noise. TWBCL highlights the presence of hierarchies of attractors and that processes not described by equilibrium considerations are key elements of the dynamics. Ref. [24] on 'noise adaptation' uses a simple model with a hierarchy of attractors of different stability to show how white noise of bounded amplitude selects marginally stable attractors which cannot be escaped without infinitesimally increasing the noise amplitude. If the latter does not have bounded support, e.g. for Gaussian noise, this translates into an increasing stability of marginally stable attractors, which is a feature of RD.

B. Hierarchies & RD
The connection between complexity and hierarchies was emphasized long ago by Herbert A. Simon [25] in his seminal essay 'The Architecture of Complexity'. Simon defines a hierarchy as a system composed of subsystems, each again composed of subsystems, etc , as in a set of russian Matroska dolls. Many physical, biological and social systems conform to this paradigm. In any case, a process initiated within a sub-unit must, in the short run, remain confined to that sub-unit, and only in the long run it is able to affect the whole structure. The dynamics is then nearly decomposable. Furthermore, the hierarchical structure of the system implies that the propagation of an initially localized perturbation is a decelerating multiscaled relaxation process.
As shown in Section IV C, the hierarchical nature of complex relaxation can already be inferred from cleverly designed experiments. It can, however, also be directly ascertained from studying numerical models: Using the lid method [26], all microscopic configurations of a discrete system whose energies lie between a local energy minimum and a preset 'lid' energy value are enumerated. The master equation for the system can then be set up and solved numerically, obtaining for each time t the probability density P (t, x|x 0 ) of finding the system in a micro-state x, given a sharp initial condition at x 0 . Two micro-states x and y are considered to be locally equilibrated, if P (t,x) P (t,y) ≈ exp(E(x)/T ) exp(E(y)/T ) , where near equality is controlled by a small allowed deviation. All micro-states can then be grouped into classes. Since the system will eventually equilibrate, all states will eventually end up in the same class, and a merging of different classes can be expected during the time evolution of the system.  [27], shows what happens when the procedure is applied to a small instance of the Traveling Salesman Problem, where the tour length plays the role of energy. Equilibration proceeds in stages, with different classes merging, but never splitting again. In the resulting relaxation tree, branches merge at times nearly equally spaced on a logarithmic scale. Considering the thermally activated nature of the process, this indicates that the energy barriers allowing different quasiequilibrium states to merge are equally spaced on a linear scale.
To describe thermal relaxation dynamics, the configuration space of complex systems can be coarse-grained in a tree graph [6] of the kind illustrated by Fig. 1 In RD terminology, the stepwise process of relaxation in such a hierarchy is controlled by a series of quakes, each associated to the crossing of a record high barrier, and each giving access to a hitherto unexplored region of configuration space. If the barriers are uniformly distributed on the energy axis, the quakes are uniformly distributed on a logarithmic time scale. The simplified picture sketched above neglects the discrete nature of the energy barrier structure but ensures that quaking is, as assumed by RD, a Poisson process whose average is proportional to the logarithm of time, for short, a log-Poisson process.
An alternative description of the nature of the processes described with RD is facilitated in terms of (free-)energy landscapes. As such landscapes are widely used to conceptualize disordered materials, it illustrates further the wide applicability of RD. A relaxing system is characterized by growing domains, however slowly, of correlated behavior between DoF. For domains to grow (in an otherwise finite system), others have to disappear, a process referred to as "coarsening". To overturn such a domain in a complex system requires crossing an energy barrier. It is known [29] that there are only two generic scenarios: either barrier energies grow with the size of Ref. [27] shows the hierarchical structure of relaxation at fixed temperature in the state space of a Traveling Salesman Problem. Each leaf of the tree represents a set of states which are in local equilibrium with one another at a certain level of approximation ǫ. The insert shows how a leaf splits into a whole tree when a smaller value of ǫ is chosen. The vertical axis of the tree is proportional to the logarithm of time: thus the system undergoes a series of local equilibrations (= merging of subtrees ) at times almost equally spaced on a logarithmic scale.
3 F Figure 2. Sketch of the hierarchical (free-)energy landscape of a complex system in RD, from Ref. [28], with a typical trajectory of an aging dynamics (blue and red). With increasing free energy F , local minima proliferate exponentially but also become shallower. After a quench (red-dashed arrow), the dynamics evolves through a sequence of quasi-equilibrium explorations (blue) and intermittent, irreversible quakes over record barriers (red) that access an exponentially expanding portion of configuration space (black-dashed arrow).
the domain to be overturned, or they are independent.
In the latter case, average sizes of the remaining domains grow as a powerlaw with time, ∼ t 1/2 , but if bigger domains cost ever more energy to overturn, their growth is merely logarithmic with time, ∼ ln t. While such growth may be hard to observe on any experimental scale, it still enforces the following feedback loop: The landscape of such a complex system has a hierarchical structure in that, the lower energies (i.e., larger domains) are reached, the higher the barriers get, and thus, larger fluctuations are needed to escape local minima [28]. Such a feedback is absent in the coarsening of, e.g., an Ising ferromagnet, where barriers remain independent of domain size, thus, independent of the depth within the landscape. (If entropic effects become relevant [29], like for a structural glass, the same argeument holds for free-energy barriers.) Hence, RD presupposes three features that are wellestablished for hierarchical energy landscapes: (1) Metastable states, and thus their combined basin of attraction, proliferate exponentially for increasing free energy [30][31][32][33]; (2) more-stable (lower-energy) states typically have higher barriers against escaping their basin [29,33,34]; and (3) higher jumps in energy make exponentially more configurations accessible [7,35]. Such a landscape is illustrated (as a projection of a configuration space with a large number of DoF) in Fig. 2. In a quench, the system almost certainly gets stuck initially in one of the many shallow basin of high energy. There, a small, random fluctuation already suffices to escape into a larger basin containing many sub-basins, some of which feature local minima of lower energy. But for reaching basins of ever lower energy, gradually, higher fluctuations are required to escape. The gain in stability acquired in any one of these escapes will most probable only be marginal, since spontaneously finding minima of dramatically lower energy would be exponentially unlikely. The motion in and out of states less stable than the current basin only provides reversible quasi-equilibrium fluctuations. For an irreversible quake, typically, only a rare record fluctuation in the noise impinging on the system will suffice [24], whether it is in a spin glass [36,37], in a colloidal glass [28,38,39], or even in a athermal (tapped) granular pile [40]. RD, hence, offers an analytic in-road to coarse-grained descriptions that bridge the aging phenomenology. This approximation supersedes the particular properties of the hierarchy of barriers in a given system, e.g. small differences in energy or density, or some microscopic details. As long as such a hierarchy exists [7,34,[41][42][43][44][45][46][47], i.e., the system is actually jammed, microscopic distinctions only vary by an overall unit of time.
The examples presented in the following show how the toy pictures just discussed must be nuanced. A hierarchical barrier structure captures nevertheless metastability and is thus a prerequisite for RD to be applicable.

III. MATHEMATICAL BACKBONE
A stationary time series of e.g. measured temperature values in a thermostated system, qualifies as white noise if successive measurements are statistically independent, which can be achieved by taking them sufficiently far apart. As will become apparent, the statistics of records is independent of the mechanism generating the noise. In particular it is independent of temperature in the case of temperature records in thermal fluctuations. red The universality of record statistics lies behind the shared phenomenology of microscopically very different metastable systems which experience punctuated equilibria. Below the mathematical properties of record statistics are highlighted which can be easily extracted from time series of experimental, observational or simulational data.
Consider a stationary series S k , k = 0, 1 . . . of statistically independent and identically distributed scalar random variables, so-called white noise. magenta We assume that the distribution is not supported on a finite set. By definition, the first entry of the series is a record. Other entries are records if and only if they are larger than the previous record. Memory of each record is maintained by an algorithm, and the issue of how record fluctuations can leave an indelible mark in the evolution of a physical system does not arise in this connection.
Record-sized entries within form a sub-series R ⊂ S whose interesting statistical properties follow from simple arguments [24,[48][49][50]: As it gets increasingly harder for an entry to qualify as the next record, records appear at a decreasing rate and the sub-series of record-sized entries is not stationary. The number of records in the time interval (1, t) is to a good approximation a Poisson process with average ln t. Equivalently, if the k'th record appears at times t k , the ratios ln(t k /t k−1 ) are independent random variables with an exponential distribution. The above properties are derived using heuristic approximations, and their validity is checked numerically.

A.
Record statistics in white noise Denote by p n (l, [0, m 1 . . . m n ]) the probability that n records are located at entries 0, m 1 , . . . , m n of the series S, and let P n (l) be the probability that n out of the l + 1 entries are records, regardless of their location. Clearly, in the case n = 1, the first entry must be the largest. As the largest entry can be anywhere with equal probability, is the probability of finding precisely one record in the first l + 1 elements of the series. For two records, one occurring at 0 and the other at m 1 ≤ l, we find The first factor on the r.h.s. of the equation is the probability that the zero' th element is the largest of the first m 1 entries, and the second is the probability that the largest element of the whole series be located at m 1 .
The probability that exactly two out of l + 1 elements are records is then where the harmonic number H l = l k=1 1/k satisfies with γ = 0.57721 . . . being the Euler-Mascheroni constant. Turning now to an arbitrary n, we find, by the same arguments, To obtain P n (l), the distribution p n (l, [0, m 1 , m 2 , . . . m n−1 ]) must be summed over all possible values of m 1 , m 2 . . . m n−1 . Unfortunately, a closed form expression can only be obtained in the continuum limit, where all sums turn into integrals.
To carry out the needed approximation, assume that S is obtained by sampling a stationary signal at regular intervals of duration δt. Furthermore, let t k def = m k δt and t def = lδt be the time of occurrence of the k'th record in the series, and the total observation time, respectively. Noting that m k = t k /δt, and taking the limit δt → 0, we find As the integrand in the above expression is a symmetric function of the arguments t 1 , . . . , t n−1 , any permutation of the arguments does not change the integral. Summing over all permutations of the order n − 1 variables and dividing by (n − 1)! leaves the expression unchanged and yields In the last expression, which can be recognized as a Poisson distribution with expectation value µ n (t) = log(t), the convention that the first entry is always a record is abandoned for convenience. As well known, the variance of the process equals its mean, σ 2 (t) = µ n (t).
Clearly, the number of records falling between times t w and t > t w is the difference of two Poisson process, and hence itself a Poisson process with expectation µ n (t w , t) = ln(t) − ln(t w ) = ln(t/t w ). The average number of records per unit of time decays as Importantly, the logarithmic rate of events is constant, and the transformation t → ln t renders the series R memoryless and translationally invariant. A Poisson process is uniquely characterized by the exponential PDF of the waiting time to the next event. In our case, Eq. (8) implies that ln t plays the role of time variable, which brings our focus on the logarithmic waiting time between successive events, the k'th waiting time The ∆ t k are independent of k and exponentially distributed with unit average. Their exponential PDF is thus where k has been dropped from the notation. Finally, since is a sum of k independent exponential variables with unit average, it follows that τ k is Gamma distributed, with density given by For large k, the Gamma distribution can be approximated by a Gaussian, and the waiting time t k is therefore approximately log-normal. redTo justify the mathematical approximations behind Eqs. (8), i.e. mainly replacing sums with integrals, we consider 500 series each comprising 50000 independent Gaussian deviates with zero mean and unit variance. To create a time variable, time intervals between the adjacent elements of each series are drawn from a uniform distribution in the unit interval. The actual record subsequences of each stationary time series are extracted and analyzed statistically. Figure 3 shows that the log waiting times are exponentially distributed, as claimed. The insert shows that the average and variance of the number of records are nearly proportional to the logarithm of time, in fair agreement with Eq. (9).
To conclude, the properties of record statistics of noisy data are expectedly quite general, as the the noise statistics does not enter, with exception of the independence of successive noise events.

B. Quake statistics in real data
In applications, the time series S will often be a physical signal describing pseudo equilibrium fluctuations, e.g. red The statistics describes records in white noise time series, constructed from independent Gaussian deviates, as explained in the main text. energy fluctuations at constant temperature. The subsequence R is not directly available as record signal, but is extracted from S by heuristically identifying the anomalous events, or quakes, leading from one pseudoequilibrium state to the next. A statistical analysis is then used to ascertain if R behaves as the record signal of S.
The analysis must consider the spatial structure from which the signal originates. A spatially extended system with a finite correlation length can be treated as composed of M dynamically independent domains, with M proportional to the system's size. If quakes within each domain are a record signal, their average and variance will both grow as M ln t. Correspondingly, the logarithmic rate of quakes will be r lt = M . In a time series comprising all quakes, the log-waiting times between successive events will mix inputs from different domains and their PDF will not be given by F ∆ln t (x) = exp(−M x). Equation (11) is only valid when the log-waiting times ln t k − ln t k−1 are constructed from quakes occurring in the same domain. If this spatial distinction is neglected, the time differences t k − t k−1 rather than the logarithmic time differences ln t k /t k−1 will be exponentially distributed. This property allows one to estimate the domain size as the largest spatial domain from whose dynamical data comply with Eq.(11), see e.g. [51].

C.
Power laws from RD Power laws express an underlying scale invariance and are naturally connected to critical behavior. In some cases, however, scale invariance is not associated to length scales in real space, but to free energy scales in configuration space. To be concrete, thermal relaxation in a self-similar energy landscape with valleys within valleys, as in Fig. 2, on all scales is described by power laws [5][6][7] over a range of temperatures extending from zero up to a critical value [52].
RD is a coarse-grained description of relaxation in a hierarchy, and produces power-laws with no connection to criticality. The details can differ, but stem in all applications from log-time homogeneity. Since we will show below that RD is a log-Poisson process, we consider its effects on different observables without further ado here. The first example, further discussed in IV A 3, is the survival probability P (t|t w ) that a species extant at time t w is still extant at t > t w , where times are measured in a suitable unit. A natural assumption is that each quake impinging on the system reduces the survival probability by the same factor, i.e. as a function of the number n of quakes, with 0 < x < 1. In order to extract a time dependence, the expression must be averaged over the Poisson distribution of the number of quakes falling in the observation interval (t w , t). With the constant logarithmic quaking rate coefficient denoted by r q , this yields This is an example of pure aging and of a power-law whose exponent λ = −r q (1 − x) is unrelated to critical behavior.
Consider now the coarse-grained autocorrelation function of, say, the energy, between times t w and t. Because of log-time homogeneity, it has the generic form [37] where λ k < 0 are the eigenvalues of the (unknown) master equation describing the process and w k > 0 are the corresponding positive spectral weights.

A. Punctuated Equilibrium and Ecosystem dynamics
Gould and Eldredge first suggested [53] and then vigorously argued [54,55] that the abrupt changes observed in the fossil record, i.e. mass extinctions, are a significant mode of evolution, rather than the outcome of random fluctuations. Their thesis, dubbed Punctuated Equilibrium, is reconciled with Darwinian phyletic gradualism in Stephen J. Gould's opus magnum [56], The Structure of Evolutionary Theory.
Raup and Sepkoski [57] analyzed mass extinctions in the marine fossile record and proposed that the apparent decline in their rate since the Cambrian explosion was the consequence of an on-going optimization process. This idea was later taken up in [49,58], where extinction events were described in terms of jumps in a rugged fitness landscape, and the decay was linked to record statistics.
Newman and Eble [59] showed that the decline in the extinction rate during the Phanerozoic is accurately described by a logarithmic increase of the cumulated number of extinct families vs. time, measured since the Vendian-Cambrian boundary at about 544 Ma. This observation, we now belatedly realize, fully concurs with an RD description of the evolution process. Newman and Eble's analysis is discussed in more detail in the next section.
Punctuated Equilibrium has also been interpreted [60,61] as a manifestation of Self-Organised Criticality, originally a stationary paradigm for complex behavior of slowly driven systems [62]. Along this line, Bak and Sneppen [60] use a simple and elegant model, where a set of "species" comprises an array of random numbers between zero and one, each considered to be a fitness measure. Each species interacts with neighbors, say two neighbors for simplicity. The dynamics iterates two steps: a removal of the species with the lowest fitness and its two neighbours, followed by the replacement of the removed, i.e. "extinct ", species with three new randomly drawn numbers. The removal of the two neighbors represents the interdependence between co-evolving species. Note that the number of species remains constant in time.
The model soon enters a state in which the fitness values of extant species are mainly concentrated above a certain threshold value. Every so often, a newly introduced species will possess a fitness below this threshold and a burst of extinctions occurs until all fitness values have returned above the threshold. The number of extinction and replacement events needed to return the system above threshold has a broad distribution and in this sense the model exhibits features of SOC.
Whether the Bak-Sneppen model exhibits Punctuated Equilibrium or not is in the eyes of the beholder. The extinction and renewal activity is always occurring at the same rate, namely one, the species of lowest fitness, plus the assumed number of its neighbours, which is the same for all species. Bak and Sneppen, however, argued that the real. evolutionary time scale is not given directly by the updates, but is an exponential function of how far below the threshold a given lowest fitness value is. As the update activity is considered on this supposedly true macroscopic time, punctuation is introduced because of the assumed exponential blow up. It is perhaps not entirely satisfactory that the main feature the model is supposed to reproduce hinges on an exponential relation to blow up the waiting time between abrupt events. Such a relationship is seen in thermal activation over energy barriers, although being far below the threshold should produce a faster rather than slower return to average behavior.
In the following we discuss evolution processes with very different time scales involving the fossil record, bacterial cultures in a flask, a model of ecosystem evolution and the sequence of exit times of ants from their nest. In the last case, both observational data and a model linking them to spatial rearrangements of interacting ants are discussed.

Macroevolution and the fossil record
The analysis by Newman and Eble [59] of marine families extinction, on which this section is based and from which Fig. 4 is taken, argues that species extinction is a decelerating process, and suggest a date at which this process began, i.e. 260 My before the Cambrian explosion. We include these important results together with equally important data from bacterial evolution [63,64], to highlight the similarity of evolutionary dynamics on very different time scales and to suggest an explanation in terms of RD. The original work should be consulted for more details. Figure 4 shows the cumulated number of extinctions of marine families, as a function of time. The cumulative extinction appears as a straight line when plotted on loglinear axes. Note that the time axis is inverted. The fit provided is of the form where the 'initial time' s given as t 0 = −262 My. The insert shows that a log-log plot does not lend itself to a linear fit, and that a power-law description is unsuitable.
The fossil record will never provide decades of data on a scale of million of years. In spite of this obvious limitation, the quality of the fit is confirmed by the presence of small logarithmic oscillations, and the conclusion that macro evolution is not a stationary process, a property of SOC type models [60,61], seems inescapable. Without discussing specific dynamical models, we note that extinction events are macroscopic irreversible changes whose cumulated number grows logarithmically in time. Both these properties characterize quakes in RD and are found in the Tangled Nature Model described below.

2.
Bacterial populations in a controlled environment Lenski and Travisano [63] and later Weiser et al., [64] measured the Malthusian fitness of 12 E. coli populations grown in a constant physical environment over 10000 and 50,000 generations, respectively. A generation begins when a bacterial sample is injected in fresh substrate and ends when its population reaches its first plateau. In the experiments, 12 initially identical cultures follow different evolutionary paths, all leading to increasing cell size and increasing Malthusian fitness, i.e. initial rate of growth of the population. In the lack of competition with other organisms, all evolutionary changes must concern the metabolic pathways needed to process the substrate. The situation can be modelled by an adaptive walk on a rugged fitness landscape [65] where any improvement, i.e. a fitness record, quickly moves the population to a higher fitness peak [66], i.e. induces the macroscopic change we presently call a quake. Assuming that each quake leads to the same improvement, RD predicts a logarithmic Malthusian fitness increase of the bacterial populations. Figure 5 depicts the Malthusian fitness of the 12 populations as dots and their average as circles. The logarithmic trend of the data is clear. It is equally clear that a logarithmic trend cannot continue indefinitely and that the growth law f = f 0 (1 − exp(−αt)), where the exponent α is a positive number close to zero, grows as a logarithm for t ≪ 1/α and remains bounded for t → ∞.
In summary, the Malthusian fitness of E. choli evolving in a fixed environment undergoes a decelerating process similar in certain respects to evolution on much larges scales. The situation is well described by the Kauffman model [65], where the fitness of an individual is fully determined by its genome. This is the opposite of the Tangled Nature Model, described below, where fitness is determined by a tangle of interactions connecting individuals of different species. Interestingly, both models feature logarithmic growth laws.

A model of ecosystem dynamics
The agent based Tangled Nature Model [67,68] captures key elements of ecosystem evolution, including punctuated equilibrium [67] and a power-law distributed life-time of species [69]. Over the last two decades a whole family of Tangled Nature models appeared, a development recently described in [70], where applications to different fields of science are reviewed.
Closely related to our current RD focus, the model features qualitative structural changes or quakes which can be interpreted as extinctions. To a good approximation these are uniformly distributed on a logarithmic time axis [71], i.e. the logarithmic rate of quakes is constant, as predicted by RD. Correspondingly, the integrated number of quakes grows with the logarithm of time, which concurs with the behavior extracted from the fossil record [59]. Punctuations, i.e. quakes, irreversibly disrupt quasi-Evolutionary Stable Strategies (qESS), periods of metastability where population and the number of extant species, or diversity, fluctuate reversibly around fixed values.
The basic model variables are binary strings of length L, i.e. points of the L dimensional hypercube. Variously called species or sites, these are populated by agents or individuals, which reproduce asexually in a way occasionally affected by random mutations. Only a tiny fraction of the possible sites ever becomes populated during simulations lasting up to one million generations. The extant species, i.e. those with non-zero populations at a given time, are collectively referred to as ecosystem, and their number as diversity. Unlike e.g. the Kauffman model [65], an agent's reproductive success is not predicated on its genome, but the interactions, or couplings, it has with other agents.
How agent b affects the reproductive ability of agent a and vice-versa only depends on the species to which these agents belong. The interaction is non-zero with probability θ, in which case the couplings (J ab , J ba ) are extracted, once and for all, from a distribution well described by the Laplace double exponential density p(x) = 1 2a e −|x−x|/a . For the data shown, the parameters x and a are estimated to −0.0019 and 0.0111, respectively. Note that the long term dynamics of the model strongly depends on the coupling distribution having finite or infinite support See Refs. [69,71] for further details.
The set of interactions linking an individual to others is key to its reproductive success and arguably constitutes its most important property. Yet, in many studies, e.g. [71][72][73], the interactions of an individual and those of its mutated off-spring are unrelated, a rather unrealistic feature corresponding to a point mutation turning a giraffe into an elephant.
This issue has been addressed [74,75] by introducing correlated interactions between parent and off-spring. More recently, a family of models was introduced [69] , parametrized by a positive integer K, where K = 1 is the original version, with uncorrelated interactions, and where the degree of correlation grows with K. The main effect of correlated interaction is seen in the species survival probability, described in Fig.8.
Let S, N b (t) and N denote the ecosystem, the population size of species b, and the total population N (t) = b N b (t). An individual of type a is chosen for reproduction with probability n a = N a /N , and succeeds with probability p off (a) = 1/(1 + e −Ha ), where and where is a density weighted coupling. In Eq. (18), µ is a positive constant which represents the finiteness of resource availability curbing population growth. Letting p mut be the mutation probability per bit, parent and offspring differ by k bits with probability Bin(k; K, p mut ), the binomial distribution. Death occurs with probability p kill and time is given in generations, each equal to the number of updates needed for all extant individuals to die. Thus, with population N at the end of the preceding generation, the upcoming generation comprises N p kill updates. The parameters used are always L = 20, µ = 0.10, θ = 0.25, p kill = 0.20, p mut = 0.01, and the initial condition invariably consists of a single species populated with 500 individuals. The concepts of core and cloud species are key to define quakes and understand the entropic mechanism causing the increasing duration of the qESS. A core is a group of species linked by mutualistic interactions. As a consequence, the majority of the agents resides in the core. The cloud consists of many species, each having few individuals, often mutants of core species.
A quake is an event which rapidly changes the composition of the core, e.g. by replacing some or all its component species. The event is initiated by a new mutant cloud species which receives strong positive interactions from the core. As a consequence, it enjoys great reproductive success and its growing population destabilizes the core via the global negative feed-back associated to the µ term of Eq. (18).
The growing duration of qESS reflects an entrenchment into metastable configuration space components of increasing entropy: It becomes increasingly difficult to generate by mutations a new cloud species able to threaten the core. The step requires an increasing number of mutations as the system ages.
In the main panel of Fig. 7, the logarithmic waiting times log(t quake /t w ) for ouakes falling after time t w , averaged over an ensemble of 2022 repetitions, are plotted vs. t w with 1σ error bars. These average log-waiting times are seen to be nearly independent of t w , which indicates that quake dynamics is log-time homogeneous, as RD posits. The insert shows the average Hamming distance between cloud species and the most populous core species. The distance grows as the logarithm of time, which implies that core mutants become on average gradually less fit, and . more mutations are needed to generate a viable mutant which can destabilize the core. Agents die at a constant rate, and have a finite expected life-time. In contrast, as we shall see, species go extinct at a decelerating rate, and their life-times do not possess finite averages. A cohort is defined as the set of species extant at time t w and their persistence P (t w , t) as the fraction of the cohort still extant at time t > t w . Persistence provides an estimate of the probability that a species extant at time t w still is extant at later times. The distinction between the two quantities is glossed over in the following. The life-time probability density function of a species extant at t w is then Figure 8, taken from Ref. [69] shows that persistence data obtained for different t w collapse when plotted as function of t/t w , a so called pure aging behavior. The two data sets describe models with uncorrelated and correlated interactions between parent and off-spring (K = 1 and K = 5, respectively.) As expected, inheritance of interactions leads to higher persistence. The lines are fits to a power-law y(t/t w ) = a(t/t w ) b , For K = 1 the exponent is b = −0.283 (14) and for K = 5 it is b = −0.117 (6). The behavior is as predicted by RD, see e.g. Eq.(15) Three comments are in order: first, independently of the degree of inheritance, Eq. (20) shows that the life-time distribution lacks a finite average. Second, we see that species created at a late stage of the evolution process (large t w ) are more resilient than those created early on, implying that the rate of quakes decreases in time. Third, the exponent of the persistence decay is more than halved when K goes from 1 to 5, clearly showing that inheritance produces a more robust ecology where species live longer.

Movement in ant colonies
That biological systems comprising many interacting agents evolve through punctuations is not narrowly dependent on the type of interaction. Social interactions in ant colonies of Temnothorax albipennis seem to generate the non-stationary ant motion behavior studied by Richardson et al. [76] and described by an RD analysis of statistical data. Our fig. 9 is taken from this reference, which should be consulted for further details. Importantly, the analysis of [76] is based on observational data and hence independent of modeling assumptions. A theoretical model of the same phenomenology [77] is also discussed below.
In the experiment, fifteen T. albipennis colonies were housed in cardboard nests kept at a constant temperature (24 • C), with continuous lighting. All colonies hosted a queen and had a complement of brood at various stages of development. An ant's first exit from the nest was recorded as an event and to ensure that only first exits were counted the ant was either removed or allowed to return after all ants were initially tagged and the tag detected upon exit. The PDF of the logarithmic waiting time between successive exits is exponential, as expected in RD and the cumulated number of exits grows proportionate to the logarithm of time. This behavior is shown in Fig. 9 for a number of different measurements. In the lower right corner of each panel, NR stands for "Non Removal and R for "Removal", which refers to the fate of the ants exiting the nest.
All in all, a RD description of ant movement in T. albipennis seems reasonably accurate. This conclusion was challenged by Nouvellet et al. [78], who performed new experiments where the exit process could more simply be described by standard Poisson statistics, the result expected if the ants move randomly and independently in and out of their nest. The question is of course whether the two sets of experiments really were equivalent [79]. More important from our vantage point is the mechanism producing the gradual entrenchment of dynamical trajectories in more long-lived metastable configurations.
In the model proposed in [77], from which Figs.10 and 11 are taken, ants of different types move in a stochastic fashion from one site to a neighboring site on a finite 2D lattice, similar to an ant's nest. Each site corresponds to a small area which can be occupied by several ants, and one site is designed as exit. The probability of each move is determined by the changes of a utility function E it entails. E is a sum of pairwise interactions between ants, weighted by distance. Depending on type, ants can have positive or negative interactions, meaning that moving closer induces a positive or negative change δE in the utility function. A standard Metropolis update algorithm is used [80] with the probability of carrying out a move given by min[exp((δE/T ), 1], where the parameter T , called 'degree of stochasticity' (DS) controls the importance of the interactions, similarly to the temperature in a physical model. The sign inversion relative to the usual Metropolis convention implies that movements increasing the utility function are unconditionally accepted.
Since interactions are negligible in the limit T → ∞, the ants act independently and their movement out of the nest are a Poisson process in linear time, as observed by Nouvellet et al. [78]. At low T , the systems enters a non-stationary regime, where exits can be described as suggested by Richardson et al. [76]. In model simulations starting from a random distribution of ants in the nest, ever larger ant clusters establish themselves on gradually fewer sites. This in turn creates continually growing dy- namical barriers (e.g. empty sites) for ants which have not yet joined a cluster. Whether a similar mechanism involving social structures is at play in real ant nests remains of course to be seen. Let us finally note that the size of the barrier scaled during ant movement was not investigated in [77]. We are therefore unable to give a precise definition of quake. and use nest exits as a proxy.
All simulational data presented in Fig. 10 pertain to a system with four types of ants. The value of the utility function per ant, averaged over 100 trajectories is plotted vs time (i.e. number of MC sweeps) on a logarithmic scale. For T = 500 and 200, a constant value is approached. The latter increases as the DS decreases, since the system's ability to maximize its utility function initially improves when lowering T . However, for T = 5 and T = 10 the average E only grows logarithmically without approaching equilibrium. In the two low T curves, the mean value of the utility function is seen to increase with T rather than decreasing as in the stationary regime. This highlights the strong non-equilibrium nature of the relaxation process. Figure 11 describes the exit statistics, (corresponding to quake statistics) in a system with two ant types. At the k' th sweep the program checks whether motion has occurred at the site dubbed 'exit' and, if so, registers the corresponding time t k . The simulations, each running from t = 5 to t = 5 · (1 + 10 5 ), are repeated 100 times in order to improve the statistics. The DS values used in the simulation are T = 50 and T = 5 in the left and right panel, respectively. The data shown are statistically very different in spite of being graphically rather similar. In the left panel the waiting times, i.e. the time differences ∆ k = t k − t k−1 are analyzed with respect to their correlation and their distribution. Since the t k ' s are integer rather than real numbers, the exit process can never be truly Poissonian. We nevertheless estimate the normalized correlation function of the ∆ k 's, averaged over 100 independent runs. For independent entries, the latter would equal the Kronecker's delta C ∆ (k) = δ k,0 . We furthermore estimate the probability that ∆ > x, as a function of x. For a Poisson process this probability decays exponentially in x. The correlation and probability distribution are plotted in the main figure and the insert, using a linear and a logarithmic ordinate, respectively. We see that short waiting times, i.e. ∆ k 's of order one are over-represented relative to the straight line representing the Poisson case. Secondly, the correlation  Figure 11. Left: The correlation function C∆(k) for the waiting times between consecutive exit events, plotted vs. k. The insert shows the cumulative distribution of the waiting times, plotted on a logarithmic vertical scale. The system contains two types of ants moving on a grid of linear size 7 with DS parameter is T = 50. Right: same as above, except that the DS value is here T = 5 and that the correlation and cumulative distribution are calculated using the log-waiting times rather than the waiting times. Figure taken from [77].
decays to about 1/10 in a single step, but then lingers at that value. Taken together, these two feature indicate that a short waiting time is more likely followed by another short waiting time, i.e. that the motion often stretches over several sweeps.
The right panel of the figure shows data obtained as just discussed, except that logarithmic time differences τ k = ln(t k ) − ln(t k−1 ) rather than linear ones are utilized. The correlation function C τ (k) decays quickly to zero, albeit not in a single step, and the probability that τ > x is nearly exponential. Again, short log-waiting times are over-represented in the distribution, and since the correlation decays to near zero in k = 5, they are likely to follow each other. Thus, also in this case ant at the 'exit' stretches beyond a single sweep. In summary, banning the effect of our time unit, the sweep, being too short relative to the de-correlation time of ant motion" T = 50 data are, as expected in a stationary regime, well described by a Poisson process, while T = 5 data are well described by a log-Poisson process.
To conclude, the experiments of Richardson et al. [76] showed that the nest exit statistics of Temnothorax albipennis are well described by RD. This behavior requires a hierarchy of dynamical barriers, expectedly due to ant interactions whose details are unknown. A simple agent based model of interacting ants has a 'glassy' phase where the interaction between the agents are important and the experimental findings are reproduced. When the agents' motion becomes sufficiently random, the system equilibrates, and the RD behavior disappears. The model hints at a possible origin of the barrier hierarchy required by RD, but the issue needs further work for clarification, as does the biological relevance and possible mechanism controlling the degree of stochasticity of real ants.

B.
High Tc superconductivity. The ROM model This section deals with the motion of magnetic flux lines inside type II superconductors. We mainly follow Ref. [81], which describes simulations of the Restricted Occupancy Model (ROM). This reference, which should be consulted for further details, presents an early application of RD ideas and uses a notation which differs from ours in the rest of this paper. In particular, t is not the time elapsed since an initial quench, but an observation time. For the reader's convenience we follow in this section the notation of Ref. [81]. Type II superconductors are layered materials which remain superconducting even after an applied magnetic field H begins to enter the sample at a field strength H > H c1 . When H exceeds a second value H c2 > H c1 the material becomes a normal conductor. In the range H c1 < H < H c2 , the external magnetic field penetrates into the bulk of the superconductor by forming vortex or flux lines created by circulating currents, or vortices, of superconducting electrons. Parallel flux lines have a repulsive interaction energy that depends on their distance, and pushes them towards the bulk of the material. Inhomogeneities tend to trap the flux lines and impede the rapid establishment of a uniform flux density. A mechanical force balance is established between the pinning force exerted by an inhomogeneity, also called a pinning center, and the force produced by the gradient in the surrounding flux line density. A pinning center corresponds to a local potential energy well from which thermal fluctuations can release a flux line and allow it to move towards regions of lower flux density, kicking as a result other flux lines out of their traps. In this way, flux avalanches lower the gradient of flux line density as flux lines move into the bulk region of the material. As a consequence, the difference between the internal magnetic field and the external applied field decreases in a relaxation process towards thermodynamic equilibrium. red The ROM model introduces a discrete grid for each superconducting layer and uses the number n i of flux lines transversing the i'th plaquette of the grid as dynamical variables. The only interactions included are those between flux lines in the same or nearest neighbour plaquettes, and the motion of the flux lines is represented by a time variation of their numbers in each plaquettes. The energy of a flux line configuration {n i } is given by The first term represents the repulsion energy due to vortex-vortex interaction within a layer, and the second the vortex self energy. As already mentioned, interactions beyond nearest neighbors are neglected. We set A 0 def = A ii = 1, A 1 def = A ij if i and j are nearest neighbors on the same layer, and A ij = 0 otherwise. The third term represents the interaction of the vortex pancakes with the pinning centers. A p is a random potential and we assume for simplicity that A p has the following distribution P ( The pinning strength |A p 0 | represents the total action of the pinning centers located on a site. In the simulations described here A p 0 = −0.3. Finally the last term in Eq. (21) describes the interactions between the vortex sections in different layers. This term is a nearest neighbor quadratic interaction along the z axis, so that the number of vortices in neighboring cells along the z direction tends to be the same.
The parameters of the model are defined in units of A 0 . The time is measured in units of Monte Carlo (MC) sweeps. We assume the external magnetic field to be applied perpendicularly to the planes and for this reason we only consider motion of the vortex pancakes within the planes and use periodic boundary conditions in the zdirection. Each individual Monte Carlo update consists in selecting a vortex pancake at random and moving it to a randomly selected neighbour position. As always in Metropolis importance sampling the movement of the vortex is automatically accepted if the energy of the system decreases or remain unchanged. If the energy of the system increases, the movement is accepted with probability exp(−∆E/T ).
The low temperature dynamical evolution of the response to a magnetic field quickly ramped up to a constant value at t = 0 is shown in Fig. 12, taken from [81]. The total number N (t) of vortices, which is plotted versus the logarithm of time, mainly changes through nearly vertical steps which punctuate equilibrium-like plateau values. The overwhelming majority of steps leads irreversibly to a higher number of vortices and we hence identify them with our quakes.
The quake size is the number v of vortices it allows to enter the system and its Probability Density Function of v is shown in the insert of Fig. 13 for several different temperatures and three different observation times, all starting at t w = 1000 MC sweeps. Statistical insight into the time evolution in the number of vortices present within the system is provided by Fig. 13, where the empirical distributions of N (t) are displayed for three different times, which are equidistantly placed on a logarithmic time axis. The insert in Fig. 13 shows the tail of the probability density function of the number of vortices, p(v), entering during a single quake. To a good approximation, the tail is exponential and the time and temperature dependence of of p(v) are negligible, except for the highest temperature T = 0.5. From Eq.(8), the probability that exactly q quakes occur during the time interval [t w , t w + t] is log-Poisson distributed according to  where α is the logarithmic quaking rate.
We approximate the PDF for the number of vortices v which enter during a given quake (see insert Fig. 13) by an exponential distribution p(v) = exp(−v/v)/v, and assume that consecutive quakes are statistically independent. The number of vortices entering during exactly q quakes is then a sum of exponentially distributed. independent variables, and is hence Gamma distributed. Finally the PDF of the total number of vortices entering during [t w , t w + t] by averaging the Gamma distribution for q quakes over Eq. (22). This leads to the following expression for the PDF of the total number of vortices ∆N = N (t+t w )−N (t w ) entering during the time interval where I 1 denotes the modified Bessel function of order 1. To estimate q , α = 22.6 is used for the logarithmic quaking rate. The averagev either obtained from the distributions in the insert of Fig. 13 or from fitting Eq. (23) to the simulated data in the main frame of the same figure is estimated to bev = 16. We note thatv is essentially temperature independent for temperatures below T ≈ 0.1. The probability density function, P (N (t + t w ) − N (t w )) is plotted as a smooth line for t w = 1000 and three different observation times t = 188, 2791, 8371 in the main panel of Fig. 13, while the jagged line depicts the simulation results.The agreement is reasonable, considering the numerous approximations entering the derivation.
If we imagine that individual flux lines are trapped into local energy wells and escape them as a result of thermal fluctuations over a single energy barrier ∆E, the rate of the process would contain an Arrhenius factor exp(−∆E/(k b T )) and feature a strong dependence on the temperature T . This naive scenario is disproven by experimental observations showing that the magnetic relaxation rate, or creep rate, only varies weakly in a broad range of temperature, see [82].
To study ROM behavior in this respect, the external field is rapidly increased at t = 0 to a constant value. The time dependence of the total number of vortices in the sample is then studied for several values of the temperature T . The procedure is repeated for many realizations in order to estimate an average vortex bulk density n(t). As shown in Fig. 14, n(t) gradually increases as vortices move in from the boundary and the value of T has little effect except at the highest temperatures. Finally, in the region 3 10 2 < t < 10 4 n(t) depends linearly on the logarithm of time.
The near independence of the creep rate on the temperature suggests the presence of a continuum of barriers such that a record high energy fluctuation is enough to trigger a quake. The behavior than follows from the independence of record statistics on the thermal noise signal distribution. red Data obtained simulating the ROM model briefly described in the main text. Figure taken from Ref. [81].

C. Spin glasses
A general property of aging systems first noticed in spin-glasses is that short time probes, e.g., the imposition of a high frequency AC field under cooling at a constant rate, elicit a (pseudo)equilibrium response, while long time probes, the same external field applied at constant temperature, reveal the true non-equilibrium nature of the relaxation process. Interestingly, 'short' and 'long' must be understood relative to the time elapsed since the initial thermal quench, i.e., the system age t w . This property alone suffices to establish the presence of a hierarchy of dynamical barriers in configuration space.
The experiments of Jonason et al [83], whose main result is shown in Fig. 15, clearly demonstrate the hierarchical barrier structures present in spin glass configuration space. In the figure, the imaginary part χ" of the AC susceptibility of a spin glass is plotted as a function of temperature. Importantly, the system is first cooled at a constant rate, except for a pause at temperature T = 12K, after which cooling is resumed. Once the lowest temperature is reached, re-heating at a constant rate, without pausing at T = 12K is carried out. The reference curve, which results from cooling at a constant rate, is the equilibrium result expected when the frequency of the applied magnetic field is sufficiently high. The dip shows that aging entrenches the system into gradually more stable configurations, with an ensuing decrease of the susceptibility. When cooling is resumed, the system rapidly returns to the reference curve, a so-called 'rejuvenation effect'. This indicates that the 'subvalleys' explored at lower temperatures, are all dynamically equivalent, irrespective of the aging process. Finally, upon reheating without pause, the system remembers the configuration space region previously explored. This 'memory effect' confirms the hierarchical nature of the energy landscape.
Thermal relaxation models associate the multi-scaled nature of aging processes to a hierarchy of metastable components of configuration space [6,7,25,84], often described as nested 'valleys' of an energy landscape as illustrated in Fig. 2. Local thermal equilibration is described in terms of time dependent valley occupation probabilities [27], which are controlled by transition rates over the available 'passes'. When applied to a hierarchical structure, such description gradually coarsens over time as valleys of increasing size reach equilibrium. That barrier crossings are connected to record values in time series of sampled energies [85,86] is a central point in record dynamics.
In connection with spin-glasses, RD has predictions describing Thermo-Remanent Magnetization (TRM) data [87] where quakes are extracted from experimental data as anomalous magnetic fluctuations. The results described below are based on more recent numerical work [88] to which we refer for further details, and which describes the spontaneous magnetic fluctuations of the Edwards-Anderson model [89]. As detailed in [88], quakes are linked to record sized entries in the time series of observed energy values in isothermal simulations. We note in passing that this approach requires the use of an event driven Monte Carlo algorithm, the Waiting Time Method [85], where event times are real numbers on scales much finer than a MC sweep. Figure. 16 shows, on a logarithmic ordinate, the PDFs of energy changes ∆ measured over time intervals of two different types and scaled according to the temperature  [88] at which the simulation is run. The lower curve, which is well fitted by a Gaussian of zero mean, describes energy changes over short time intervals of equal length. The upper curve, which feature an exponential decay on its left wing, describes energy fluctuations observed from quake to quake, i.e. over time intervals of varying length. Results from isothermal simulations carried out at seven different temperatures, T = .3, .4, . . . .7, .75 and .8 are collapsed by the scaling ∆ → T −α ∆, α = 1.75. The scaling form reflects entropic effects linked to the density of states near local energy minima.
Consider now the times of occurrence t ′ and t of two successive quakes, t > t ′ , and form the logarithmic time difference ∆ln = ln(t) − ln(t ′ ) = ln(t/t ′ ) > 0, called for short, log waiting time. If quaking is a Poisson process in logarithmic time, the corresponding PDF, F ∆ln (x) is given theoretically by where r q is the constant logarithmic quaking rate. The applicability of equation (24) has already been tested in a number of different systems, including spin-glasses [90]. The upper panel of Fig. 17 shows the empirical PDFs of our logarithmic waiting times, sampled at different temperatures and collapsed through the same scaling as above, ∆ln → T −α ∆ln. The resulting PDF is fitted by the expression F T −α ∆ln (x) = .81e −1.57x , which covers two decades of decay. Its mismatch with the correctly normalized expression (24) stems from the systematic deviations from an exponential decay visible for small x values. These deviations arise in turn from quakes which occur in rapid succession in distant parts of the system. These are uncorrelated and their statistics is not captured by the tranformation t → ln t. The insert shows that log-waiting times are to a good approximation δcorrelated.
Neglecting temporally very close events usually originating from well separated parts of the system, leads to the corrected number of quakes n q (t) occurring up to time t which is shown in the bottom panel of the figure for seven different aging temperatures. The steepest curve corresponds to the lowest temperature. The red dotted lines are linear fits of n q (t) vs. ln t, and the insert shows that the logarithmic slope of the curves is well described by the function r q = 1.11T −1. 75 . We note that the logarithmic quake rate as obtained from the exponent (not the pre-factor) of the fit y(x) = .81e −1.57x is r q = 1.57T −1.75 . The two procedures followed to determine the quaking rate are thus mathematically but not numerically equivalent: in the time domain they give the same T −1.75 /t dependence of the quaking rate, but with two different pre-factors. The procedure using the PDF of the logarithmic waiting times seems preferable, due to better statistics.
Glossing over procedural difference, we write r q = cT −1.75 where c is a constant, and note that in our RD description the number of quakes occurring in the interval [0, t) is then a Poisson process with average µ N (t) = cT −α ln(t). Qualitatively, we see that lowering the temperature decreases the log-waiting times and correspondingly increases the quaking rate. The quakes involve, however, much smaller energy differences at lower temperatures. Considering that T −α ≫ T −1 , we see that the strongest dynamical constraints are not provided by energetic barriers. As discussed in Ref. [88], they are entropic in nature and stem from the dearth of available low energy states close to local energy minima.
Finally, our numerical evidence fully confirms the idea that quaking in the Edward-Anderson model is a Poisson process whose average is proportional to the logarithm of time. In other words, the transformation t → ln t renders the aging dynamics (log) time homogeneous and permits a greatly simplified mathematical description.

D.
Hard sphere colloidal suspensions Hard sphere colloids suspensions (HSC) are a paradigmatic and intensively investigated complex system [93][94][95][96][97][98][99] with two dynamical regimes controlled by the particle volume fraction [99]: below and above a critical volume fraction one finds a time homogeneous diffusive regime and an aging regime, respectively. Here, the particle mean square displacement (MSD) grows at a decelerating rate through all experimentally accessible time scales.
The first RD description of HSC dynamics [100] reanalyzed available experimental data [95], pointing out that the particles' mean squared displacement (MSD) grows logarithmically with time. Furthermore, a heuristic model explaining the experimental observations was proposed and later extensively investigated numerically [101]. More recently Robe et al. [28] highlighted the hyperbolic decay of the rate of 'intermittent' events in 2D colloidal suspensions experiments by Yunker et al. [38], shown in Fig. 18(a).
The present exposition gives an overview of a recent numerical studies [51,91], to be consulted for technical   [38] on 2d-colloidal suspensions at several area fractions, re-analyzed in Ref. [28]. Panels of the same row contain the exact same data, however, each panel in the left column of each block is plotted against the conventional lag-time ∆t = t − tw, and against the time-ratio t/tw in the right column on the log-scale suggested by RD. The experimental data are arranged by decreasing area fractions, with ≈ 84%, ≈ 82%, and ≈ 81%, from top to bottom. The inset to the bottom left panel contains the same MSD data as that panel, but plotted on the log-log scale for ordinary diffusion; at 81%, the system is no longer jamming and full aging is avoided. Both, MSD and persistence, in the jamming regime (> 81%) of the experiments closely follow the RD predictions. details, where large 3D poly-disperse or 2D bi-disperse HSC undergo extensive MD simulations.
red To give an idea of the numerical effort involved, the MD calculations performed in [51] calculate the 3D trajectories of N = 50000 hard spheres of unit mass, interacting through the potential In the above, ǫ is the unit of energy (and temperature), r ij is the distance between particles i and j and σ i is a particle's diameter. The diameters are drawn from the uniform distribution in an interval centered at the mean particle diameter σ, which is taken as the unit of length. The unit of time, formally τ = σ √ (m/ǫ) is the time it takes an isolated particle to move its own diameter at its thermal speed.
The initial state is generated by a sudden expansion Data-Collapse Figure 20. Van Hove distribution of single particle displacements ∆x over a time-window ∆t starting at tw after quenching, from the MD simulation of a 2D colloid [91]. It spreads out more with increasing ∆t (a), but less with increasing tw (b). Data from (a-b) collapse for any fixed ratio of ∆t/tw (c), as predicted by RD.  Figure 21. For three different ages, tw = 2 · 10 4 , 8 · 10 4 and 8 · 10 5 , the PDF of ∆x, a one dimensional particle displacement sampled over a time interval ∆t ≪ tw, is plotted with a logarithmic ordinate. In both panels, the staggered line is a fit to a Gaussian of mean µG = 0 and standard deviation σG = 0.05σ, where σ is the average particle diameter. Left hand panel: the same time interval ∆t = 100τ is used for all three values of tw. Right hand panel: time intervals ∆t = 100, 400, 4000τ growing proportional to the system age are used. The volume fraction of this system is φ = 0.620. red Figure taken from [51] of the particles' volume, leading to volume fractions φ both below and above the critical value. The systems' development is subsequently followed for more than six decades in time. Two different types of particle motion are generally found in glass-formers [102]: 'in-cage rattlings', where each particle moves reversibly within a small region bounded by its neighbors, and 'cagebreakings', where 'long jumps' alter neighborhood relations. Since cage rattlings, overwhelmingly the most frequent events, on average do not produce net translations, particle spreading in HSC is facilitated by the much more rare cage breakings. We thus take those long jumps to be the spatial manifestation of quakes, and identify the latter from the statistics of single particle displacements. Finally, we check that quakes are a log-Poisson process.
The macroscopic effect of HSC quakes is the logarithmic growth of the particles' mean square displacement [28,51,100], closely resembling the experiments by Yunker et al. [38], exhibited in Fig. 19. The persistence data also shown there [28], i.e., the probability that a particle does not experience irreversible motion between times t w and t, is well described by the RD arguments for the emergence of power-laws in Sec. III C. The decay of persistence (or the intermediate scattering function [101]) gives a clear picture of the spatial heterogeneity of glassy systems: If particle motion were to follow a regular Poisson process, the persistence would decay exponentially in ∆t = t − t w , while in a log-Poisson process, persistence becomes an exponential in the number of activations, ∼ α ln (t/t w ) according to Eq. (9), leading That local averages are nearly independent of log-time is evidence that the transformation t → ln(t) renders the dynamics log-time homogeneous. b): the PDF of the 'log waiting times is estimated and plotted for two independent sets of simulational raw data, using yellow square and cyan diamond symbols, respectively. The line is an exponential fit to both estimated PDFs. The insert shows the normalized autocorrelation function of the sequence of logarithmic waiting times corresponding to the yellow square PDF. To a good approximation, the log-waiting times are uncorrelated and their PDF decays exponentially, which implies that quaking is a log-Poisson process. The volume fraction of the system is φ = 0.620. red Figure taken from [51] to a power-law decay in (t/t w ) α . The distribution of single particle displacements, aka, the self part of the Van Hove function, was investigated in both, Refs. [51,91], using the method introduced in Ref. [37] to describe heat transfer in the Edwards Anderson spin-glass. The probability density function (PDF) of displacements ∆x occurring over a short time interval ∆t (lag time) for given values of the system age t w is written as (25) which is normalized over all values of the dummy variable x. Specifically, G is sampled by collecting all positional changes x i (t w − ∆t) − x i (t w ) occurring at age t w over time intervals of duration ∆t, i.e. in the interval I(t w ) = (t w , t w + ∆t). To improve the statistics, spherical and reflection symmetry is used to i) merge the independent displacements in all orthogonal directions into a single file, representing a fictitious 'x' direction, and ii) to invert the sign of all negative displacements. Compatibly with the requirement ∆t ≪ t w needed to associate G with a definite age t w , a ∆t much larger than the mean time between collisions is preferable, as it accommodates many in-cage rattlings. Figure 21 depicts PDFs of single particle displacements in a 3D colloid of volume fraction φ = 0.620 [51]. Small displacements have an age independent Gaussian PDF, corresponding to the staggered line, while larger displacements strongly deviate from Gaussian behavior, as already seen in Fig. 6 of Ref. [98]. The displacements occur over short time intervals of length ∆t ≪ t w and are sampled in three longer observation intervals of the form [t, 1.1t] where t = 2 10 4 τ, t = 8 10 4 τ and t = 8 10 5 τ , where τ is the simulational time unit. Since the length of these intervals is only a tenth of the time at which observations commence, aging effects occurring during observation can be neglected and t can be identified with the system age, i.e. t w ≈ t. In panel a) of Fig. 21 a single value ∆t = 100τ was used for all data sampling, while in panel b) values proportional to the system age, ∆t = 100, 400 and 4000τ were used.
That the central part of G ∆x|tw,∆t is a Gaussian distribution with zero mean indicates that displacements of small length arise from many independent and randomly oriented contributions, which stem from multiple in-cage rattlings. The typical size of the cage can then be identified with the standard deviation of the Gaussian part of the PDF which is seen to be independent of age. The exponential tail is produced by cage-breakings, i.e. displacements well beyond the cage size. The weight of the non-Gaussian tail in panel a) of Fig. 21 is seen to decrease with increasing age while the length distribution of displacements of length exceeding 0.5σ G is shown in [51] to be exponential and age independent. Panel b) of Fig. 21 shows that scaling ∆t with the age t w reasonably collapses the data. The same effect is obtained in [91] for 2D colloidal suspensions and for other values of the ratio ∆t/t w , see Fig. 20. Summarizing, the particle displacement statistics provides a cage size estimate and a way to identify quakes as cage breakings.
The key step of quake identification in a given setting has some leeway, but in spatially extended systems spatio-temporal correlations play a major role: The existence of spatial domains, a property which reflects the strong spatial heterogeneity of glassy dynamics [103], is required in a RD description of HCS. Spatially extended aging systems of size N [104] contain an extensive number α(N ) ∝ N of equivalent spatial domains. Events occurring in different domains are statistically independent while those occurring in the same domain have long-lived temporal correlations. These are formally removed in RD by the transformation t → ln t. If this device works, the total number of quakes occurring in the system between times t w and t > t w is a Poisson process with average µ(t, t w ) = α(N )r q ln(t/t w ), where r q is the average logarithmic quake rate in each domain. The number of quakes is extensive and grows at a constant rate in log-time and at a rate proportional to 1/t in real time. Within each domain, the rate r q can be read off the log-waiting time PDF F ∆ln t (x) = r q e −rqx , i.e. the probability density that the log-waiting time to the next quake equals x. Usually, only a minuscule fraction of configuration space is explored during an aging process, and many variables do not participate in any quake. Hence, the domain size can grow in time with no changes in α(N ), which is best understood as the number of active domains where quake activity occurs. In Ref. [51], the simulation box is subdivided into 16 3 equal and adjacent sub-volumes, each containing, on average, slightly more than ten particles. The size is the largest possible yielding log-time homogeneous quake statistics. Let t k denote the time of occurrence of the k'th quake. Panel a) of Fig. 22 shows for the 3D colloid that the log-waiting times between successive quakes occurring within the same domain, ∆ ln t = ln t k − ln t k−1 , are uniformly distributed on a logarithmic time axis, while panel b) shows that i) they are uncorrelated, which we take as a proxy for independence, and ii) exponentially distributed. Together these properties imply that quaking is a log-Poisson process within each domain, and by extension in the whole system. We emphasize that subdividing the system into domain is crucial when calculating the log-waiting times. Similar data can be obtained for the 2D colloid, and even in other materials, as shown in Fig. 18(b). It demonstrates that, when too many events blur together (e.g., in a domain with too many particles n), it hides the local impact of the decelerating activated events and the exponential tails weaken. The data collapses when counts are rescaled by n (see Insets). In Ref. [92] we have shown identical behavior for spin glasses as well as for our heuristic model [101] of RD, revealing a property renewal processes [105][106][107] can not reproduce without extra assumptions [107]. The logarithmic growth of the particles' MSD, resembling the data in Fig. 19, is also measured and displayed in [51,91]. Here we simply note that the 'long jumps' associated to quakes have an age independent PDF. Hence, each jump increments the MSD by an age independent amount, and the MSD and the number of quakes are simply proportional.

origin.
A clear distinction is made between the fluctuation dynamics within each equilibrium state and the dynamics of the quakes which punctuate them. RD posits that quakes are a log-Poisson process [104], whose decelerating nature stems from the record high free energy barriers which must be successively overcome to destroy an equilibrium state and generate the next. This mesoscopic description can be verified, possibly falsified, by investigating the statistics of alleged quakes in observational and computational data.
SOC and RD are both coarse-grained statistical descriptions of stationary respectively non-stationary dynamics of complex systems. In spite of this important difference, they both rely on the presence of a multitude of marginally stable metastable attractors [23]. These attractors are fixed in SOC, but gradually evolve in RD, where quake triggering configuration changes are systematically demoted to reversible fluctuations. This is implied by the fact that a record barrier only counts as such at the first crossing, and reflects the hierarchical nature of the underlying landscape.
Continuous Time Random Walks (CTRW) [1,2,111,112] describe metastable system in terms of jumps from trap to trap characterized by a fixed waiting time distribution. Once a jump has occurred, the system is back to square one and ready to jump again in the same fashion. CTRW are thus renewal processes, which produce, as RD also does, the sub-diffusive behavior they were designed for, if the waiting time probability density function is chosen to be a power-law. However, CTRW do not offer case specific microscopic justifications of this choice.
In spite of some superficial similarities, CTRW and RD rely on orthogonal physical pictures of complex behavior [113]. Whether one or the other should be chosen in a specific application must rest on empirical evidence [114] and, mainly, on the philosophical inclinations of the investigator.
In the RD applications presented, the focus has been i) to provide the physical background, ii) to show how the log-Poisson nature of the quaking process can be extracted from empirical data and iii) how RD predicts macroscopic behavior.
One RD feature which plays a role in spatially extended system, and which has not been highlighted so far, stems from the simple mathematical fact that the sum of two independent Poisson processes with averages m 1 and m 2 is itself a Poisson process with average m 1 + m 2 . An extended system with short ranged interactions can be treated as composed of independent parts with linear size of the order of the correlation length. If quaking in each part is described by a log-Poisson process, the same applies to the whole system. Consequently, the quaking rate is an extensive quantity which scales proportionally with the system size. This is as reasonably expected and can be advantageous when analyzing experimental or numerical data from different sources, since the system size dependence can be scaled out. Finally, since the average and variance of a Poisson process are identical, the distribution itself becomes sharp in the thermodynamic limit, which means that RD correctly recognizes the selfaveraging property of large systems.
There are a number of RD applications not discussed for space reasons, of which one, the Parking Lot Model [115] deserves a brief mention because i) its barrier hierarchy is purely entropic and ii) it features a length scale increasing with the system age, a real space property generally expected in evolving systems with growing ergodic components.
In conclusion, RD connects microscopic interactions and macroscopic phenomenology via a mesoscopic description, based on log-time homogeneity of the quake dynamics. To analyze a new problem with RD one needs an hypothesis, which can of course be iteratively improved, on what quakes are and how to identify them. Once this is done the dynamical consequences of the quakes in real as well as configuration space can be studied. The approach is generally applicable to systems going through metastable equilibria and invariably produces a decelerating dynamics.
Interestingly, this is not the case for human cultural evolution, a staged process with rapid transitions between stages, not presently discussed, which is accelerating rather than decelerating. However, as recently argued in [116], the natural variable of that process is not time in SI units, but the number of inter-personal interactions a quantity growing super linearly in time. In this case, an RD description of an underlying decelerating optimization process can lead to a process which accelerates, when described as function of wall-clock time.