Reference Work Entry

Encyclopedia of Complexity and Systems Science

pp 3700-3720

# Fractal Geometry, A Brief Introduction to

• Armin Bunde, Institut für Theoretische Physik
• Shlomo Havlin, Institute of Theoretical Physics, Bar-Ilan University

## Definition of the Subject

In this chapter we present some definitions related to the fractal concept as well as several methods for calculating the fractal dimension and other relevant exponents. The purpose is to introduce the reader to the basic properties of fractals and self‐affine structures so that this book will be self-contained. We do not give references to most of the original works; instead, we refer mostly to books and reviews on fractal geometry where the original references can be found.

Fractal geometry is a mathematical tool for dealing with complex systems that have no characteristic length scale. A well‐known example is the shape of a coastline. When we see two pictures of a coastline on two different scales, with 1 cm corresponding for example to 0.1 km or 10 km, we cannot tell which scale belongs to which picture: both look the same, and this feature also characterizes many other geographical patterns such as rivers, cracks, mountains, and clouds. This means that the coastline is scale invariant or, equivalently, has no characteristic length scale. Another example is financial records. When looking at a daily, monthly, or annual record, one cannot tell the difference. They all look the same.

Scale‐invariant systems are usually characterized by noninteger (“fractal”) dimensions. The notion of noninteger dimensions and several basic properties of fractal objects were studied as long ago as the nineteenth century by Georg Cantor, Giuseppe Peano, and David Hilbert, and at the beginning of the twentieth century by Helge von Koch, Waclaw Sierpinski, Gaston Julia, and Felix Hausdorff. Even earlier traces of this concept can be found in the study of arithmetic‐geometric averages by Carl Friedrich Gauss about 200 years ago and in the artwork of Albrecht Dürer (see Fig. 1) about 500 years ago. Georg Christoph Lichtenberg discovered fractal discharge patterns about 230 years ago. He was the first to describe the observed self‐similarity of the patterns: a part looks like the whole. Benoit Mandelbrot [1] showed the relevance of fractal geometry to many systems in nature and presented many important features of fractals. For further books and reviews on fractals see [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16].

Before introducing the concept of fractal dimension, we should like to remind the reader of the concept of dimension in regular systems. It is well known that in regular systems (with uniform density) such as long wires, large thin plates, or large filled cubes, the dimension d characterizes how the mass M(L) changes with the linear size L of the system. If we consider a smaller part of the system of linear size bL ($${ b<1 }$$), then M(bL) is decreased by a factor of $${ b^d }$$, i. e.,
$$M(bL) = b^dM(L) \: .$$
(1)
The solution of the functional equation (1) is simply $${ M(L)=AL^{d} }$$. For the long wire the mass changes linearly with b, i. e., $${ d=1 }$$. For the thin plates we obtain $${ d=2 }$$, and for the cubes $${ d=3 }$$; see Fig. 2.
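As a quick check, inserting the power law into Eq. (1) shows that it satisfies the functional equation for every b. Here A is a constant prefactor (the mass at unit length):
$$M(bL) = A (bL)^d = b^d A L^d = b^d M(L) \: .$$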

Next we consider fractal objects. Here we distinguish between deterministic and random fractals. Deterministic fractals are generated iteratively in a deterministic way, while random fractals are generated using a stochastic process. Although fractal structures in nature are random, it is useful to study deterministic fractals where the fractal properties can be determined exactly. By studying deterministic fractals one can also gain insight into the fractal properties of random fractals, which usually cannot be treated rigorously.

## Deterministic Fractals

In this section, we describe several examples of deterministic fractals and use them to introduce useful fractal concepts such as fractal and chemical dimension, self‐similarity, ramification, and fractal substructures (minimum path, external perimeter, backbone, and red bonds).

### The Koch Curve

One of the most common deterministic fractals is the Koch curve. Figure 3 shows the first $${ n=4 }$$ iterations of this fractal curve. With each iteration the length of the curve is increased by a factor of $${ 4/3 }$$. The mathematical fractal is defined in the limit of infinite iterations, $${ n\to \infty }$$, where the total length of the curve approaches infinity.

The dimension of the curve can be obtained as for regular objects. From Fig. 3 we notice that, if we decrease the linear size by a factor of $${ b=1/3 }$$, the total length (mass) of the curve decreases by a factor of $${ 1/4 }$$, i. e.,
$$M \left( \tfrac{1}{3} \, L \right) = \tfrac{1}{4} \, M(L) \: .$$
(2)
This feature is very different from regular curves, where the length of the object decreases proportional to the linear scale. In order to satisfy Eqs. (1) and (2) we are led to introduce a noninteger dimension, satisfying $${ 1/4=(1/3)^d }$$, i. e., $${ d=\log4/\log3 }$$. For such non‐integer dimensions Mandelbrot coined the name “fractal dimension” and those objects described by a fractal dimension are called fractals. Thus, to include fractal structures, Eq. (1) is generalized by
$$M(bL) = b^{d_\mathrm{f}} M(L) \: ,$$
(3)
and
$$M(L) = A L^{d_\mathrm{f}} \: ,$$
(4)
where $${ d_\mathrm{f} }$$ is the fractal dimension.
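The scaling behind Eq. (2) can be checked numerically. The following is a minimal sketch, assuming the curve is built on the unit interval and represented by complex vertices; the function names `koch_curve` and `curve_length` are illustrative and not taken from the text.

```python
import math

def koch_curve(iterations):
    """Return the vertices (as complex numbers) of the Koch curve on [0, 1]."""
    points = [complex(0.0, 0.0), complex(1.0, 0.0)]
    rot = complex(0.5, math.sqrt(3.0) / 2.0)  # rotation by +60 degrees
    for _ in range(iterations):
        refined = [points[0]]
        for start, end in zip(points, points[1:]):
            third = (end - start) / 3.0
            a = start + third            # end of the first third
            peak = a + third * rot       # tip of the added triangle
            b = start + 2.0 * third      # start of the last third
            refined.extend([a, peak, b, end])
        points = refined
    return points

def curve_length(points):
    """Sum of the segment lengths along the polyline."""
    return sum(abs(q - p) for p, q in zip(points, points[1:]))

n = 4
pts = koch_curve(n)
segments = len(pts) - 1          # each iteration replaces 1 segment by 4
length = curve_length(pts)       # each iteration multiplies the length by 4/3
d_f = math.log(4) / math.log(3)  # from 1/4 = (1/3)**d_f
```

Each iteration multiplies the number of segments by 4 while shrinking them by 1/3, which is exactly the content of Eq. (2).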

When generating the Koch curve and calculating $${ d_\mathrm{f} }$$, we observe a striking property of fractals: self‐similarity. If we examine the Koch curve, we notice that there is a central object in the figure that is reminiscent of a snowman. To the right and left of this central snowman there are two other snowmen, each being an exact reproduction, only smaller by a factor of $${ 1/3 }$$. Each of the smaller snowmen has again still smaller copies (by $${ 1/3 }$$) of itself to the right and to the left, etc. Now, if we take any such triplet of snowmen (consisting of $${ 1/3^m }$$ of the curve), for any m, and magnify it by $${ 3^m }$$, we will obtain exactly the original Koch curve. This property of self‐similarity or scale invariance is the basic feature of all deterministic and random fractals: if we take a part of a fractal and magnify it by the same magnification factor in all directions, the magnified picture cannot be distinguished from the original.

For the Koch curve as well as for all deterministic fractals generated iteratively, Eqs. (3) and (4) are of course valid only for length scales L below the linear size $${ L_0 }$$ of the whole curve (see Fig. 3). If the number of iterations n is finite, then Eqs. (3) and (4) are valid only above a lower cut-off length $${ L_\mathrm{min} }$$, with $${ L_\mathrm{min}=L_0/3^n }$$ for the Koch curve. Hence, for a finite number of iterations, there exist two cut-off length scales in the system, an upper cut-off $${ L_\mathrm{max}=L_0 }$$ representing the total linear size of the fractal, and a lower cut-off $${ L_\mathrm{min} }$$. This feature of having two characteristic cut-off lengths is shared by all fractals in nature.

An interesting modification of the Koch curve is shown in Fig. 4, which demonstrates that the chemical distance is an important concept for describing structural properties of fractals (for a review see, for example, [16] and Chap. 2 in [13]). The chemical distance ℓ is defined as the length of the shortest path on the fractal between two sites of the fractal. In analogy to the fractal dimension $${ d_\mathrm{f} }$$ that characterizes how the mass of a fractal scales with (air) distance L, we introduce the chemical dimension $${ d_\ell }$$ in order to characterize how the mass scales with the chemical distance ℓ,
$$M(b\ell) = b^{d_\ell} M(\ell) \: , \quad \text{or } \; M(\ell) = B\ell^{d_\ell} \: .$$
(5)
From Fig. 4 we see that if we reduce ℓ by a factor of 5, the mass of the fractal within the reduced chemical distance is reduced by a factor of 7, i. e., $${ M(1/5 \, \ell) = 1/7 \, M(\ell) }$$, yielding $${ d_\ell=\log 7/\log 5\cong 1.209 }$$. Note that the chemical dimension is smaller than the fractal dimension $${ d_\mathrm{f} = \log7/\log4 \cong 1.404 }$$, which follows from $${ M(1/4 \, L) = 1/7 \, M(L) }$$.
The structure of the shortest path between two sites represents an interesting fractal by itself. By definition, the length of the path is the chemical distance ℓ, and the fractal dimension of the shortest path, $${ d_\mathrm{min} }$$, characterizes how ℓ scales with (air) distance L. Using Eqs. (4) and (5), we obtain
$$\ell \sim L^{d_\mathrm{f}/d_\ell} \equiv L^{d_\mathrm{min}} \: ,$$
(6)
from which follows $${ d_\mathrm{min}=d_\mathrm{f}/d_\ell }$$. For our example we find that $${ d_\mathrm{min}=\log 5/\log4\cong 1.161 }$$. For the Koch curve, as well as for any linear fractal, one simply has $${ d_\ell=1 }$$ and hence $${ d_\mathrm{min}=d_\mathrm{f} }$$. Since, by definition, $${ d_\mathrm{min}\ge 1 }$$, it follows that $${ d_\ell \le d_\mathrm{f} }$$ for all fractals.
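The step from Eqs. (4) and (5) to Eq. (6) can be written out explicitly: both equations describe the same mass, so equating them and solving for ℓ gives
$$A L^{d_\mathrm{f}} = B \ell^{d_\ell} \quad \Rightarrow \quad \ell = (A/B)^{1/d_\ell} \, L^{d_\mathrm{f}/d_\ell} \: .$$
For the modified Koch curve of Fig. 4 the ratio of the two exponents reduces to
$$d_\mathrm{min} = \frac{\log 7/\log 4}{\log 7/\log 5} = \frac{\log 5}{\log 4} \: .$$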

### The Sierpinski Gasket, Carpet, and Sponge

Next we discuss the Sierpinski fractal family: the “gasket”, the “carpet”, and the “sponge”.

The Sierpinski gasket is generated by dividing a full triangle into four smaller triangles and removing the central triangle (see Fig. 5). In the following iterations, this procedure is repeated by dividing each of the remaining triangles into four smaller triangles and removing the central triangles.

To obtain the fractal dimension, we consider the mass of the gasket within a linear size L and compare it with the mass within $${ 1/2 \, L }$$. Since $${ M(1/2 \, L)=1/3 \, M(L) }$$, we have $${ d_\mathrm{f}=\log3/\log2\cong 1.585 }$$. It is easy to see that $${ d_\ell=d_\mathrm{f} }$$ and $${ d_\mathrm{min}=1 }$$.

#### The Sierpinski Carpet

The Sierpinski carpet is generated in close analogy to the Sierpinski gasket.

Instead of starting with a full triangle, we start with a full square, which we divide into $${ n^2 }$$ equal squares. Out of these squares we choose k squares and remove them. In the next iteration, we repeat this procedure by dividing each of the small squares left into $${ n^2 }$$ smaller squares and removing those k squares that are located at the same positions as in the first iteration. This procedure is repeated again and again.

Figure 6 shows the Sierpinski carpet for $${ n=5 }$$ and the specific choice of $${ k=9 }$$. It is clear that the k squares can be chosen in many different ways, and the fractal structures will all look very different. However, since $${ M(1/n \, L) = 1/(n^2-k) \, M(L) }$$ it follows that $${ d_\mathrm{f} =\log (n^2-k)/\log n }$$, irrespective of the way the k squares are chosen. Similarly to the gasket, we have $${ d_\ell=d_\mathrm{f} }$$ and hence $${ d_\mathrm{min}=1 }$$.
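The construction rule can be sketched in a few lines. This is an illustrative implementation, not the one used for Fig. 6: the function names are hypothetical, and the removed squares are assumed here to be the central 3 × 3 block, which is only one of the many possible choices with $${ k=9 }$$.

```python
import math

def sierpinski_carpet(n, removed, iterations):
    """Occupied cells (x, y) of a generalized Sierpinski carpet.

    n          -- each square is divided into n * n subsquares
    removed    -- set of (i, j) positions, 0 <= i, j < n, removed at every step
    iterations -- number of subdivision steps
    """
    kept = [(i, j) for i in range(n) for j in range(n) if (i, j) not in removed]
    cells = [(0, 0)]
    for _ in range(iterations):
        # Every surviving cell is replaced by the same pattern of kept subcells.
        cells = [(n * x + i, n * y + j) for (x, y) in cells for (i, j) in kept]
    return cells

def carpet_dimension(n, k):
    """Fractal dimension log(n^2 - k) / log(n) of the carpet."""
    return math.log(n * n - k) / math.log(n)

# Assumed example: n = 5 with the central 3 x 3 block removed (k = 9).
central_block = {(i, j) for i in range(1, 4) for j in range(1, 4)}
cells = sierpinski_carpet(5, central_block, 3)
```

The number of cells after m iterations is $${ (n^2-k)^m }$$ regardless of which k positions are removed, which is why $${ d_\mathrm{f} }$$ does not depend on that choice.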

In contrast, the external perimeter (“hull”, see also Fig. 1) of the carpet and its fractal dimension $${ d_\mathrm{h} }$$ depend strongly on the way the squares are chosen. The hull consists of those sites of the cluster that are adjacent to empty sites and are connected with infinity via empty sites. In our example, see Fig. 6, the hull is a fractal with the fractal dimension $${ d_\mathrm{h}=\log 9/\log 5\cong 1.365 }$$. On the other hand, if a Sierpinski carpet is constructed with the $${ k=9 }$$ squares chosen from the center, the external perimeter stays smooth and $${ d_\mathrm{h}=1 }$$.

Although the rules for generating the Sierpinski gasket and carpet are quite similar, the resulting fractal structures belong to two different classes: finitely ramified and infinitely ramified fractals. A fractal is called finitely ramified if any bounded subset of the fractal can be isolated by cutting a finite number of bonds or sites. The Sierpinski gasket and the Koch curve are finitely ramified, while the Sierpinski carpet is infinitely ramified. For finitely ramified fractals like the Sierpinski gasket many physical properties, such as conductivity and vibrational excitations, can be calculated exactly. These exact solutions help to provide insight into the anomalous behavior of physical properties on fractals, as was shown in Chap. 3 in [13].

#### The Sierpinski Sponge

The Sierpinski sponge shown in Fig. 7 is constructed by starting from a cube, subdividing it into $${ 3 \times 3 \times 3 = 27 }$$ smaller cubes, and taking out the central small cube and its six nearest neighbor cubes. Each of the remaining 20 small cubes is processed in the same way, and the whole procedure is iterated ad infinitum. After each iteration, the volume of the sponge is reduced by a factor of $${ 20/27 }$$, while the total surface area increases. In the limit of infinite iterations, the surface area is infinite, while the volume vanishes. Since $${ M(1/3 \, L)=1/20 \, M(L) }$$, the fractal dimension is $${ d_\mathrm{f}=\log 20/\log 3\cong2.727 }$$. We leave it to the reader to prove that both the fractal dimension $${ d_\mathrm{h} }$$ of the external surface and the chemical dimension $${ d_\ell }$$ are the same as the fractal dimension $${ d_\mathrm{f} }$$.

Modifications of the Sierpinski sponge, in analogy to the modifications of the carpet, can lead to fractals in which the fractal dimension of the hull, $${ d_\mathrm{h} }$$, differs from $${ d_\mathrm{f} }$$.

### The Dürer Pentagon

Five‐hundred years ago the artist Albrecht Dürer designed a fractal based on regular pentagons, where in each iteration each pentagon is divided into six smaller pentagons and five isosceles triangles, and the triangles are removed (see Fig. 8). In each triangle, the ratio of the larger side to the smaller side is the famous proportio divina or golden ratio, $${ g\equiv 1/(2 \cos 72^\circ) \equiv (1+\sqrt{5})/2 }$$. Hence, in each iteration the sides of the pentagons are reduced by $${ 1+g }$$. Since $${ M( L/(1+g)) = 1/6 \, M(L) }$$, the fractal dimension of the Dürer pentagon is $${ d_\mathrm{f}=\log 6/\log (1+g)\cong 1.862 }$$. The external perimeter of the fractal (see Fig. 1) forms a fractal curve with $${ d_\mathrm{h}=\log 4/\log(1+g) }$$.

A nice modification of the Dürer pentagon is a fractal based on regular hexagons, where in each iteration one hexagon is divided into six smaller hexagons, six equilateral triangles, and a David-star in the center, and the triangles and the David-star are removed (see Fig. 9). We leave it as an exercise to the reader to show that $${ d_\mathrm{f}=\log6/\log3 }$$ and $${ d_\mathrm{h}=\log4/\log3 }$$.

### The Cantor Set

Cantor sets are examples of disconnected fractals (fractal dust). The simplest set is the triadic Cantor set (see Fig. 10). We divide a unit interval $${ [0,1] }$$ into three equal intervals and remove the central one. In each following iteration, each of the remaining intervals is treated in this way. In the limit of $${ n=\infty }$$ iterations one obtains a set of points. Since $${ M(1/3 \, L)=1/2 \, M(L) }$$, we have $${ d_\mathrm{f}=\log 2/\log 3\cong0.631 }$$, which is smaller than one.
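The construction of the triadic Cantor set can be followed step by step. The sketch below uses exact rational arithmetic; the function name `cantor_intervals` is illustrative.

```python
import math
from fractions import Fraction

def cantor_intervals(iterations):
    """Intervals remaining after the given number of middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(iterations):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            next_intervals.append((left, left + third))    # keep the left third
            next_intervals.append((right - third, right))  # keep the right third
        intervals = next_intervals
    return intervals

intervals = cantor_intervals(5)
total_length = sum(right - left for left, right in intervals)
d_f = math.log(2) / math.log(3)  # from 1/2 = (1/3)**d_f
```

After n iterations there are $${ 2^n }$$ intervals of length $${ 3^{-n} }$$, so the total length $${ (2/3)^n }$$ tends to zero while the number of pieces diverges.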

In chaotic systems, strange fractal attractors occur. The simplest strange attractor is the Cantor set. It occurs, for example, when considering the one‐dimensional logistic map
$$x_{t+1} = \lambda x_t(1-x_t) \: .$$
(7)
The index $${ t=0,1,2,\dots }$$ represents a discrete time. For $${ 0\le\lambda\le4 }$$ and $${ x_0 }$$ between 0 and 1, the trajectories $${ x_t }$$ are bounded between 0 and 1. The dynamical behavior of $${ x_t }$$ for $${ t\to\infty }$$ depends on the parameter λ. Below $${ \lambda_1=3 }$$, only one stable fixed point exists to which $${ x_t }$$ is attracted. At $${ \lambda_1 }$$, this fixed point becomes unstable and bifurcates into two new stable fixed points. At large times, the trajectories move alternately between both fixed points, and the motion is periodic with period 2. At $${ \lambda_2=1+\sqrt6\cong 3.449 }$$ each of the two fixed points bifurcates into two new stable fixed points and the motion becomes periodic with period 4. As λ is increased, further bifurcation points $${ \lambda_n }$$ occur, with periods of $${ 2^n }$$ between $${ \lambda_n }$$ and $${ \lambda_{n+1} }$$.

For large n, the differences between $${ \lambda_{n+1} }$$ and $${ \lambda_n }$$ become smaller and smaller, according to the law $${ \lambda_{n+1}-\lambda_n=(\lambda_{n}-\lambda_{n-1})/\delta }$$, where $${ \delta \cong4.6692 }$$ is the so‐called Feigenbaum constant. The Feigenbaum constant is “universal”, since it applies to all nonlinear “single‐hump” maps with a quadratic maximum [17].
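The period‐doubling sequence of Eq. (7) can be observed numerically. The following sketch iterates the map, discards a transient, and looks for the smallest period of the remaining orbit. The starting value, transient length, and tolerance are assumptions chosen for illustration, and the function names are hypothetical.

```python
def logistic_orbit(lam, x0=0.3, transient=10000, keep=64):
    """Iterate x -> lam * x * (1 - x) and return `keep` points after a transient."""
    x = x0
    for _ in range(transient):
        x = lam * x * (1.0 - x)
    orbit = []
    for _ in range(keep):
        orbit.append(x)
        x = lam * x * (1.0 - x)
    return orbit

def detect_period(orbit, max_period=16, tol=1e-6):
    """Smallest p with x_{t+p} = x_t (within tol) along the orbit, or None."""
    for period in range(1, max_period + 1):
        if all(abs(orbit[i + period] - orbit[i]) < tol
               for i in range(len(orbit) - period)):
            return period
    return None

periods = {lam: detect_period(logistic_orbit(lam)) for lam in (2.8, 3.2, 3.5)}
```

With these settings the three sample values of λ lie below $${ \lambda_1 }$$, between $${ \lambda_1 }$$ and $${ \lambda_2 }$$, and just above $${ \lambda_2 }$$, so one expects periods 1, 2, and 4. Close to a bifurcation point the convergence is slow and a longer transient would be needed.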

At $${ \lambda_\infty\cong3.569\thinspace 945\thinspace 6 }$$, an infinite period occurs, where the trajectories $${ x_t }$$ move in a “chaotic” way between the infinite attractor points. These attractor points define the strange attractor, which forms a Cantor set with a fractal dimension $${ d_\mathrm{f}\cong 0.538 }$$ [18]. For a further discussion of strange attractors and chaotic dynamics we refer to [3,8,9].

### The Mandelbrot–Given Fractal

This fractal was suggested as a model for percolation clusters and their substructures (see Sect. “Percolation” and Chap. 2 in [13]). Figure 11 shows the first three generations of the Mandelbrot–Given fractal [19]. At each generation, each segment of length a is replaced by 8 segments of length $${ a/3 }$$. Accordingly, the fractal dimension is $${ d_\mathrm{f}=\log8/\log3\cong 1.893 }$$, which is very close to $${ d_\mathrm{f}=91/48\cong 1.896 }$$ for percolation in two dimensions. It is easy to verify that $${ d_\ell = d_\mathrm{f} }$$, and therefore $${ d_\mathrm{min}=1 }$$. The structure contains loops, branches, and dangling ends of all length scales.

Imagine applying a voltage difference between two sites at opposite edges of a metallic Mandelbrot–Given fractal: the backbone of the fractal consists of those bonds which carry the electric current. The dangling ends are those parts of the cluster which carry no current and are connected to the backbone by a single bond only. The red bonds (or singly connected bonds) are those bonds that carry the total current; when they are cut the current flow stops. The blobs, finally, are those parts of the backbone that remain after the red bonds have been removed.

The backbone of this fractal can be obtained easily by eliminating the dangling ends when generating the fractal (see Fig. 12). It is easy to see that the fractal dimension of the backbone is $${ d_\mathrm{B}=\log6/\log3\cong1.63 }$$. The red bonds are all located along the x axis of the figure and form a Cantor set with the fractal dimension $${ d_\mathrm{red}=\log2/\log3\cong0.63 }$$.

### Julia Sets and the Mandelbrot Set

A complex version of the logistic map (7) is
$$z_{t+1} = z_t^2+c \: ,$$
(8)
where both the trajectories $${ z_t }$$ and the constant c are complex numbers. The question is: if a certain value of c is given, for example $${ c=-1.5652-i 1.03225 }$$, for which initial values $${ z_0 }$$ are the trajectories $${ z_t }$$ bounded? The set of those values forms the filled-in Julia set, and its boundary points form the Julia set.

To clarify these definitions, consider the simple case $${ c=0 }$$. For $${ \vert z_0\vert > 1 }$$, $${ z_t }$$ tends to infinity, while for $${ \vert z_0\vert<1 }$$, $${ z_t }$$ tends to zero. Accordingly, the filled-in Julia set is the set of all points $${ \vert z_0\vert\le1 }$$, and the Julia set is the set of all points $${ \vert z_0\vert=1 }$$.

In general, points on the Julia set form a chaotic motion on the set, while points outside the Julia set move away from the set. Accordingly, the Julia set can be regarded as a “repeller” with respect to Eq. (8). To generate the Julia set, it is thus practical to use the inverted transformation
$$z_{t} = \pm \sqrt{ z_{t+1}-c} \: ,$$
(9)
start with an arbitrarily large value for $${ t+1 }$$, and go backward in time. By going backward in time, even points far away from the Julia set are attracted by the Julia set.

To obtain the Julia set for a given value of c, one starts with some arbitrary value for $${ z_{t+1} }$$, for example, $${ z_{t+1}=2 }$$. To obtain $${ z_t }$$, we use Eq. (9), and determine the sign randomly. This procedure is continued to obtain $${ z_{t-1} }$$, $${ z_{t-2} }$$, etc. By disregarding the initial points, e. g., the first 1000 points, one obtains a good approximation of the Julia set.
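The backward iteration described above can be sketched as follows. The function name, the default number of points, and the seeded random generator are assumptions made for illustration; the starting value 2 and the 1000 discarded points follow the text.

```python
import cmath
import random

def julia_points(c, num_points=5000, discard=1000, start=2.0, seed=0):
    """Approximate the Julia set of z -> z**2 + c by inverse iteration, Eq. (9)."""
    rng = random.Random(seed)
    z = complex(start)
    points = []
    for step in range(discard + num_points):
        z = cmath.sqrt(z - c)      # one of the two preimages of z
        if rng.random() < 0.5:     # choose the sign of the square root at random
            z = -z
        if step >= discard:        # skip the initial points, which are off the set
            points.append(z)
    return points

# For c = 0 the Julia set is the unit circle, which gives a simple check.
circle = julia_points(0j, num_points=200)
```

For $${ c=0 }$$ every returned point should have modulus 1 up to rounding errors, in agreement with the discussion of this simple case. For other values of c the points can be plotted in the complex plane to display the set.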

The Julia sets can be connected (Fig. 13a) or disconnected (Fig. 13b) like the Cantor sets. The self‐similarity of the pictures is easy to see. The set of c values that yield connected Julia sets forms the famous Mandelbrot set. It has been shown by Douady and Hubbard [20] that the Mandelbrot set is identical to that set of c values for which $${ z_t }$$ remains bounded when starting from the initial point $${ z_0=0 }$$. For a detailed discussion with beautiful pictures see [10] and Chaps. 13 and 14 in [3].
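In practice, membership in the Mandelbrot set is usually tested by following the trajectory of $${ z_0=0 }$$ under Eq. (8) for a finite number of steps. The sketch below is such an approximate test: the iteration limit is an assumption, and a point that has not escaped within that limit is only presumed to belong to the set. The escape radius 2 relies on the standard result that a trajectory with $${ \vert z_t\vert > 2 }$$ diverges.

```python
def stays_bounded(c, max_iter=200, escape_radius=2.0):
    """Return False as soon as the orbit of z = 0 under z -> z**2 + c escapes."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > escape_radius:
            return False   # the orbit diverges: c is outside the Mandelbrot set
    return True            # no escape within max_iter steps: c is presumed inside

samples = {c: stays_bounded(c) for c in (0j, -1 + 0j, 1j, 1 + 0j, 0.5 + 0j)}
```

Scanning `stays_bounded` over a grid of c values in the complex plane gives a coarse picture of the set; points close to the boundary need a larger `max_iter` to be classified reliably.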

## Random Fractal Models

In this section we present several random fractal models that are widely used to mimic fractal systems in nature. We begin with perhaps the simplest fractal model, the random walk.

### Random Walks

Imagine a random walker on a square lattice or a simple cubic lattice. In one unit of time, the random walker advances one step of length a to a randomly chosen nearest neighbor site. Let us assume that the walker is unwinding a wire, which he connects to each site along his way. The length (mass) M of the wire that connects the random walker with his starting point is proportional to the number of steps n (Fig. 14) performed by the walker.

Since for a random walk in any d‑dimensional space the mean end-to-end distance R is proportional to $${ n^{1/2} }$$ (for a simple derivation see, e. g., Chap. 3 in [13]), it follows that $${ M\sim R^2 }$$. Thus Eq. (4) implies that the fractal dimension of the structure formed by this wire is $${ d_\mathrm{f}=2 }$$, for all lattices.
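The relation between R and n can be checked with a small simulation on the square lattice. This is a minimal sketch with unit step length; the number of walks and the seed are arbitrary choices, and the function name is illustrative.

```python
import random

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # nearest neighbors on the square lattice

def mean_square_distance(num_steps, num_walks=2000, seed=0):
    """Average squared end-to-end distance of random walks with num_steps steps."""
    rng = random.Random(seed)
    total = 0
    for _walk in range(num_walks):
        x = y = 0
        for _step in range(num_steps):
            dx, dy = rng.choice(STEPS)
            x += dx
            y += dy
        total += x * x + y * y
    return total / num_walks

r2_short = mean_square_distance(100)
r2_long = mean_square_distance(400)
```

Within statistical fluctuations the mean squared distance equals the number of steps, so quadrupling n doubles R, consistent with $${ M\sim R^2 }$$.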

The resulting structure has loops, since the walker can return to the same site. We expect the chemical dimension $${ d_\ell }$$ to be 2 in $${ d=2 }$$ and to decrease with increasing d, since loops become less relevant. For $${ d\ge4 }$$ we have $${ d_\ell=1 }$$. If we assume, however, that there is no contact between sections of the wire connected to the same site (Fig. 14b), the structure is by definition linear, i. e., $${ d_\ell=1 }$$ for all d. For more details on random walks and their relation to Brownian motion, see Chap. 5 in [15] and [21].

### Self‐Avoiding Walks

Self‐avoiding walks (SAWs) are defined as the subset of all nonintersecting random walk configurations. An example is shown in Fig. 15a. As was found by Flory in 1944 [22], the end-to-end distance of SAWs scales with the number of steps n as
$$R\sim n^\nu \: ,$$
(10)
with $${ \nu=3/(d+2) }$$ for $${ d\le 4 }$$ and $${ \nu=1/2 }$$ for $${ d > 4 }$$. Since n is proportional to the mass of the chain, it follows from Eq. (4) that $${ d_\mathrm{f}=1/\nu }$$. Self‐avoiding walks serve as models for polymers in solution, see [23].
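The Flory formula and the resulting fractal dimension are easy to tabulate. The helper names below are illustrative; exact fractions are used so that the values can be compared directly with the text.

```python
from fractions import Fraction

def flory_exponent(d):
    """Flory value of nu: 3/(d+2) for d <= 4, and 1/2 above four dimensions."""
    return Fraction(3, d + 2) if d <= 4 else Fraction(1, 2)

def saw_fractal_dimension(d):
    """Fractal dimension d_f = 1/nu of a self-avoiding walk."""
    return 1 / flory_exponent(d)

table = {d: (flory_exponent(d), saw_fractal_dimension(d)) for d in range(1, 7)}
```

In $${ d=1 }$$ the walk is a straight line with $${ d_\mathrm{f}=1 }$$, and for $${ d\ge 4 }$$ the random walk value $${ d_\mathrm{f}=2 }$$ is recovered.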

Subsets of SAWs do not necessarily have the same fractal dimension. Examples are the kinetic growth walk (KGW) [24] and the smart growth walk (SGW) [25], sometimes also called the “true” or “intelligent” self‐avoiding walk. In the KGW, a random walker can only step on those sites that have not been visited before. Asymptotically, after many steps n, the KGW has the same fractal dimension as SAWs. In $${ d=2 }$$, however, the asymptotic regime is difficult to reach numerically, since the random walker can be trapped with high probability (see Fig. 15b). A related structure is the hull of a random walk in $${ d=2 }$$. It has been conjectured by Mandelbrot [1] that the fractal dimension of the hull is $${ d_\mathrm{h}=4/3 }$$, see also [26].

In the SGW, the random walker avoids traps by stepping only at those sites from which he can reach infinity. The structure formed by the SGW is more compact and characterized by $${ d_\mathrm{f}=7/4 }$$ in $${ d=2 }$$ [25]. Related structures with the same fractal dimension are the hull of percolation clusters (see also Sect. “Percolation”) and diffusion fronts (for a detailed discussion of both systems see also Chaps. 2 and 7 in [13]).

### Kinetic Aggregation

The simplest model of a fractal generated by diffusion of particles is the diffusion‐limited aggregation (DLA) model, which was introduced by Witten and Sander in 1981 [27]. In the lattice version of the model, a seed particle is fixed at the origin of a given lattice and a second particle is released from a circle around the origin. This particle performs a random walk on the lattice. When it comes to a nearest neighbor site of the seed, it sticks and a cluster (aggregate) of two particles is formed. Next, a third particle is released from the circle and performs a random walk. When it reaches a neighboring site of the aggregate, it sticks and becomes part of the cluster. This procedure is repeated many times until a cluster of the desired number of sites is generated. To save computational time it is convenient to eliminate particles that have diffused too far away from the cluster (see Fig. 16).
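The lattice algorithm can be sketched as follows for the square lattice. The launch radius (cluster radius plus 5) and the kill radius (three times the launch radius) are assumptions chosen for illustration, as is the function name; the text specifies only that walkers start on a circle and are discarded when they stray too far.

```python
import math
import random

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def grow_dla(num_particles, seed=0):
    """Grow a lattice DLA cluster and return the set of occupied sites."""
    rng = random.Random(seed)
    cluster = {(0, 0)}                  # seed particle at the origin
    max_radius = 0.0                    # current radius of the cluster
    while len(cluster) < num_particles:
        launch = max_radius + 5.0       # assumed launch circle, outside the cluster
        kill = 3.0 * launch             # assumed distance at which walkers are dropped
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x = int(round(launch * math.cos(angle)))
        y = int(round(launch * math.sin(angle)))
        while True:
            dx, dy = rng.choice(NEIGHBORS)
            x, y = x + dx, y + dy
            if x * x + y * y > kill * kill:
                break                   # walker diffused too far: release a new one
            if any((x + ex, y + ey) in cluster for ex, ey in NEIGHBORS):
                cluster.add((x, y))     # walker reached a neighbor site: it sticks
                max_radius = max(max_radius, math.hypot(x, y))
                break
    return cluster

cluster = grow_dla(30, seed=1)
```

Pure Python is adequate only for small clusters; measuring $${ d_\mathrm{f} }$$ reliably requires far larger aggregates and a faster implementation.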

In the continuum (off‐lattice) version of the model, the particles have a certain radius a and are not restricted to diffusing on lattice sites. At each time step, the length ($${ \le \! a }$$) and the direction of the step are chosen randomly. The diffusing particle sticks to the cluster, when its center comes within a distance a of the cluster perimeter. It was found numerically that for off‐lattice DLA, $${ d_\mathrm{f}=1.71\pm0.01 }$$ in $${ d=2 }$$ and $${ d_\mathrm{f}=2.5\pm0.1 }$$ in $${ d=3 }$$ [28,29]. These results may be compared with the mean field result $${ d_\mathrm{f}=(d^2+1)/(d+1) }$$ [30]. For a renormalization group approach, see [31] and references therein. The chemical dimension $${ d_\ell }$$ is found to be equal to $${ d_\mathrm{f} }$$ [32].

Diffusion‐limited aggregation serves as an archetype for a large number of fractal realizations in nature, including viscous fingering, dielectric breakdown, chemical dissolution, electrodeposition, dendritic and snowflake growth, and the growth of bacterial colonies. For a detailed discussion of the applications of DLA we refer to [5,13], and [29]. Models for the complex structure of DLA have been developed by Mandelbrot [33] and Schwarzer et al. [34].

A somewhat related model for aggregation is the cluster‐cluster aggregation (CCA) [35]. In CCA, one starts from a very low concentration of particles diffusing on a lattice. When two particles meet, they form a cluster of two, which can also diffuse. When the cluster meets another particle or another cluster, a larger cluster is formed. In this way, larger and larger aggregates are formed. The structures are less compact than DLA, with $${ d_\mathrm{f}\cong 1.4 }$$ in $${ d=2 }$$ and $${ d_\mathrm{f}\cong 1.8 }$$ in $${ d=3 }$$. CCA seems to be a good model for smoke aggregates in air and for gold colloids. For a discussion see Chap. 8 in [13].

### Percolation

Consider a square lattice, where each site is occupied randomly with probability p or empty with probability $${ 1-p }$$. At low concentration p, the occupied sites are either isolated or form small clusters (Fig. 17a). Two occupied sites belong to the same cluster, if they are connected by a path of nearest neighbor occupied sites. When p is increased, the average size of the clusters increases. At a critical concentration $${ p_\mathrm{c} }$$ (also called the percolation threshold) a large cluster appears which connects opposite edges of the lattice (Fig. 17b). This cluster is called the infinite cluster, since its size diverges when the size of the lattice is increased to infinity. When p is increased further, the density of the infinite cluster increases, since more and more sites become part of the infinite cluster, and the average size of the finite clusters decreases (Fig. 17c).
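Site percolation on the square lattice can be explored with a short program. The sketch below fills a lattice at random and uses a breadth-first search over nearest neighbor occupied sites to test whether a cluster connects the top row to the bottom row. The function names and the fixed seed are illustrative assumptions.

```python
import random
from collections import deque

def random_lattice(size, p, seed=0):
    """size x size lattice; each site is occupied (True) with probability p."""
    rng = random.Random(seed)
    return [[rng.random() < p for _ in range(size)] for _ in range(size)]

def spans_top_to_bottom(lattice):
    """True if occupied nearest neighbor sites connect the top and bottom rows."""
    size = len(lattice)
    queue = deque((0, col) for col in range(size) if lattice[0][col])
    visited = set(queue)
    while queue:
        row, col = queue.popleft()
        if row == size - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = row + dr, col + dc
            if (0 <= r < size and 0 <= c < size
                    and lattice[r][c] and (r, c) not in visited):
                visited.add((r, c))
                queue.append((r, c))
    return False

def spanning_fraction(size, p, trials=50):
    """Fraction of random lattices (seeds 0..trials-1) that contain a spanning cluster."""
    hits = sum(spans_top_to_bottom(random_lattice(size, p, seed=s))
               for s in range(trials))
    return hits / trials
```

Evaluating `spanning_fraction` for a range of p values shows the sharp rise near the percolation threshold, which becomes steeper as the lattice size grows.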

The percolation transition is characterized by the geometrical properties of the clusters near $${ p_\mathrm{c} }$$. The probability $${ P_\infty }$$ that a site belongs to the infinite cluster is zero below $${ p_\mathrm{c} }$$ and increases above $${ p_\mathrm{c} }$$ as
$$P_\infty \sim (p-p_\mathrm{c})^\beta \: .$$
(11)
The linear size of the finite clusters, below and above $${ p_\mathrm{c} }$$, is characterized by the correlation length ξ. The correlation length is defined as the mean distance between two sites on the same finite cluster and represents the characteristic length scale in percolation. When p approaches $${ p_\mathrm{c} }$$, ξ increases as
$$\xi\sim |p-p_\mathrm{c}|^{-\nu} \: ,$$
(12)
with the same exponent ν below and above the threshold. While $${ p_\mathrm{c} }$$ depends explicitly on the type of the lattice (e. g., $${ p_\mathrm{c}\cong 0.593 }$$ for the square lattice and $${ 1/2 }$$ for the triangular lattice), the critical exponents β and ν are universal and depend only on the dimension d of the lattice, but not on the type of the lattice.
Near $${ p_\mathrm{c} }$$, on length scales smaller than ξ, both the infinite cluster and the finite clusters are self‐similar. Above $${ p_\mathrm{c} }$$, on length scales larger than ξ, the infinite cluster can be regarded as a homogeneous system which is composed of many unit cells of size ξ. Mathematically, this can be summarized as
$$M(r) \sim \begin{cases} r^{d_\mathrm{f}}\:, & \! \! \! r \ll \xi \: , \\ r^d\:, & \! \! \! r \gg \xi \: . \end{cases}$$
(13)
The fractal dimension $${ d_\mathrm{f} }$$ can be related to β and ν:
$$d_\mathrm{f} = d- \frac{\beta}{\nu} \: .$$
(14)
Since β and ν are universal exponents, $${ d_\mathrm{f} }$$ is also universal. One obtains $${ d_\mathrm{f}=91/48 }$$ in $${ d=2 }$$ and $${ d_\mathrm{f}\cong 2.5 }$$ in $${ d=3 }$$. The chemical dimension $${ d_\ell }$$ is smaller than $${ d_\mathrm{f} }$$, $${ d_\ell\cong 1.15 }$$ in $${ d=2 }$$ and $${ d_\ell\cong 1.33 }$$ in $${ d=3 }$$. A large percolation cluster in $${ d=3 }$$ is shown in Fig. 18.
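As a worked example of Eq. (14), the known exact two‐dimensional exponents $${ \beta=5/36 }$$ and $${ \nu=4/3 }$$ (values not quoted in the text above) reproduce the stated fractal dimension:
$$d_\mathrm{f} = 2 - \frac{5/36}{4/3} = 2 - \frac{5}{48} = \frac{91}{48} \cong 1.896 \: .$$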

Interestingly, a percolation cluster is composed of several fractal substructures such as the backbone, dangling ends, blobs, external perimeter, and the red bonds, which are all described by different fractal dimensions.

The percolation model has found applications in physics, chemistry, and biology, where occupied and empty sites may represent very different physical, chemical, or biological properties. Examples are the physics of two component systems (the random resistor, magnetic or superconducting networks), the polymerization process in chemistry, and the spreading of epidemics and forest fires. For reviews with a comprehensive list of references, see Chaps. 2 and 3 of [13] and [36,37,38].

## How to Measure the Fractal Dimension

One of the most important “practical” problems is to determine the fractal dimension $${ d_\mathrm{f} }$$ of either a computer-generated fractal or a digitized fractal picture. Here we sketch the two most useful methods: the “sandbox” method and the “box counting” method.

### The Sandbox Method

To determine $${ d_\mathrm{f} }$$, we first choose one site (or one pixel) of the fractal as the origin for n circles of radii $${ R_1<R_2<\dots<R_n }$$, where $${ R_n }$$ is smaller than the radius R of the fractal, and count the number of points (pixels) $${ M_1(R_i) }$$ within each circle i. (Sometimes, it is more convenient to choose n squares of side length $${ L_1, \dots, L_n }$$ instead of the circles.) We repeat this procedure by choosing randomly many other (altogether m) pixels as origins for the n circles and determine the corresponding number of points $${ M_j(R_i) }$$, $${ j=2,3,\dots,m }$$ within each circle (see Fig. 19a). We obtain the mean number of points $${ M(R_i) }$$ within each circle by averaging, $${ M(R_i) = 1/m \, \sum_{j=1}^m M_j(R_i) }$$, and plot $${ M(R_i) }$$ versus $${ R_i }$$ in a double logarithmic plot. The slope of the curve, for large values of $${ R_i }$$, determines the fractal dimension.

In order to avoid boundary effects, the radii must be smaller than the radius of the fractal, and the centers of the circles must be chosen well inside the fractal, so that the largest circles will be well within the fractal. In order to obtain good statistics, one has either to take a very large fractal cluster with many centers of circles or many realizations of the same fractal.
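The sandbox procedure can be sketched as follows. This is an illustrative implementation under simplifying assumptions: the centers are supplied by the caller, who must choose them well inside the fractal; every point is tested against every circle; and the dimension is estimated by a least-squares fit over all radii rather than only the largest ones. The function names are hypothetical.

```python
import math

def fit_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def sandbox_dimension(points, centers, radii):
    """Slope of log M(R) versus log R, with M(R) averaged over all centers."""
    log_r, log_m = [], []
    for radius in radii:
        counts = [sum(1 for (x, y) in points
                      if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2)
                  for (cx, cy) in centers]
        mean_mass = sum(counts) / len(counts)   # M(R_i) averaged over the m centers
        log_r.append(math.log(radius))
        log_m.append(math.log(mean_mass))
    return fit_slope(log_r, log_m)

# Sanity check on a regular object: a straight line of pixels should give d close to 1.
line = [(i, 0) for i in range(-200, 201)]
centers = [(i, 0) for i in range(-50, 51, 10)]
d_line = sandbox_dimension(line, centers, radii=[10, 20, 40])
```

The estimate for the line lies slightly below 1 because a circle of radius R contains 2R + 1 pixels; such lattice corrections are the reason the slope should be read off at large radii.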

### The Box Counting Method

We draw a grid on the fractal that consists of $${ N_1^2 }$$ squares, and determine the number of squares $${ S(N_1) }$$ that are needed to cover the fractal (see Fig. 19b). Next we choose finer and finer grids with $${ N_1^2<N_2^2<N_3^2<\dots<N_m^2 }$$ squares and calculate the corresponding numbers of squares $${ S(N_1),\dots,S(N_m) }$$ needed to cover the fractal. Since the side length of each square is proportional to $${ 1/N }$$, S(N) scales as
$$S(N) \sim (1/N)^{-d_\mathrm{f}} = N^{d_\mathrm{f}} \: ,$$
(15)
and we obtain the fractal dimension by plotting S(N) versus $${ 1/N }$$ in a double logarithmic plot. The asymptotic slope, for large N, gives $${ -d_\mathrm{f} }$$.

Of course, the finest grid size must be larger than the pixel size, so that many pixels can fall into the smallest square. To improve statistics, one should average S(N) over many realizations of the fractal. For applying this method to identify self‐similarity in real networks, see Song et al. [39].
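
A minimal box counting sketch in Python follows. It assumes the fractal is a binary NumPy array and works with the box side length in pixels, so that the number of occupied boxes decreases as a power of the box size with exponent $${ -d_\mathrm{f} }$$. The helper names `box_count`, `box_counting_dimension`, and `sierpinski_carpet` are hypothetical; the Sierpinski carpet is used here only as a test object whose dimension, $${ \log 8/\log 3 }$$, is known exactly.

```python
import numpy as np

def box_count(mask, box):
    """Number of box x box squares that contain at least one occupied pixel."""
    ny, nx = mask.shape
    # Trim so the grid divides the image exactly (edge remainder is ignored here)
    m = mask[:ny - ny % box, :nx - nx % box]
    blocks = m.reshape(m.shape[0] // box, box, m.shape[1] // box, box)
    return int(blocks.any(axis=(1, 3)).sum())

def box_counting_dimension(mask, boxes):
    """Minus the log-log slope of the box count versus the box side length."""
    counts = [box_count(mask, b) for b in boxes]
    slope, _ = np.polyfit(np.log(boxes), np.log(counts), 1)
    return -slope

def sierpinski_carpet(n):
    """Binary image of the n-th generation Sierpinski carpet (3**n pixels per side)."""
    carpet = np.ones((1, 1), dtype=bool)
    for _ in range(n):
        hole = np.zeros_like(carpet)
        carpet = np.block([[carpet, carpet, carpet],
                           [carpet, hole, carpet],
                           [carpet, carpet, carpet]])
    return carpet

d_f = box_counting_dimension(sierpinski_carpet(4), boxes=[3, 9, 27])
```

The box sizes in the usage line are all larger than one pixel, in line with the remark above, and are chosen as powers of three so that the grid matches the construction of the carpet.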

## Self‐Affine Fractals

The fractal structures we have discussed in the previous sections are self‐similar: if we cut a small piece out of a fractal and magnify it isotropically to the size of the original, both the original and the magnification look the same. By magnifying isotropically, we have rescaled the x, y, and z axes by the same factor.

There exist, however, systems that are invariant only under anisotropic magnifications. These systems are called self‐affine [1]. A simple model for a self‐affine fractal is shown in Fig. 20. The structure is invariant under the anisotropic magnification $${ x\to 4x }$$, $${ y\to 2y }$$. If we cut a small piece out of the original picture (in the limit of $${ n\to\infty }$$ iterations), and rescale the x axis by a factor of four and the y axis by a factor of two, we will obtain exactly the original structure. In other words, if we describe the form of the curve in Fig. 20 by the function F(x), this function satisfies the equation $${ F(4x)=2F(x)=4^{1/2}F(x) }$$.

In general, if a self‐affine curve is scale invariant under the transformation $${ x \to bx }$$, $${ y\to ay }$$, we have
$$F(bx) = aF(x) \equiv b^H F(x) \: ,$$
(16)
where the exponent $${ H=\log a/\log b }$$ is called the Hurst exponent [1]. The solution of the functional equation (16) is simply $${ F(x)=Ax^H }$$. In the example of Fig. 20, $${ H=1/2 }$$.
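
As a quick check (a worked step added here for illustration), substituting the power law into Eq. (16) and inserting the scale factors $${ a=2 }$$, $${ b=4 }$$ of Fig. 20 gives

$$F(bx) = A(bx)^H = b^H A x^H = b^H F(x) \: , \qquad H = \frac{\log a}{\log b} = \frac{\log 2}{\log 4} = \frac{1}{2} \: .$$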

Next we consider random self‐affine structures, which are used as models for random surfaces. The simplest structure is generated by a one‐dimensional random walk, where the abscissa is the time axis and the ordinate is the displacement $${ Z(t)=\sum_{i=1}^t e_i }$$ of the walker from its starting point. Here, $${ e_i=\pm 1 }$$ is the unit step made by the random walker at time i. Since different steps of the random walker are uncorrelated, $${\langle e_ie_j\rangle =\delta_{ij} }$$, it follows that the root mean square displacement is $${ F(t)\equiv \langle Z^2(t)\rangle^{1/2}=t^{1/2} }$$, and the Hurst exponent of the structure is $${ H=1/2 }$$.

Next we assume that different steps i and j are correlated in such a way that $${ \langle e_ie_j\rangle = b\vert i-j \vert^{-\gamma} }$$, $${ 1 > \gamma \ge 0 }$$. To see how the Hurst exponent depends on γ, we have to evaluate again $${ \langle Z^2(t)\rangle=\sum_{i,j}^t \langle e_ie_j\rangle }$$. For calculating the double sum it is convenient to introduce the Fourier transform of $${ e_i }$$, $${ e_\omega=(1/\Omega)^{1/2}\sum_{l=1}^\Omega e_l \, \exp(-i\omega l)}$$, where Ω is the number of sites in the system. It is easy to verify that $${ \langle Z^2(t)\rangle }$$ can be expressed in terms of the power spectrum $${ \langle e_\omega e_{-\omega}\rangle }$$ [40]:
$$\langle Z^2(t)\rangle = \frac{1}{\Omega} \sum_\omega \langle e_\omega e_{-\omega}\rangle\vert f(\omega,t)\vert^2 \: ,$$
(17a)
where
$$f(\omega,t) \equiv \frac{\text{e}^{-i\omega(t+1)}-1}{\text{e}^{-i\omega}-1} \: .$$
Since the power spectrum scales as
$$\langle e_\omega e_{-\omega}\rangle\sim \omega^{-(1-\gamma)} \: ,$$
(17b)
the integration of (17a) yields, for large t,
$$\langle Z^2(t)\rangle\sim t^{2-\gamma} \: .$$
(17c)
Therefore, the Hurst exponent is $${ H=(2-\gamma)/2 }$$. According to Eq. (17c), for $${ 0<\gamma<1 }$$, $${ \langle Z^2(t)\rangle }$$ increases faster in time than the uncorrelated random walk. The long‐range correlated random walks were called fractional Brownian motion by Mandelbrot [1].
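
The same exponent can be made plausible directly in real space, without the Fourier transform. The following is a heuristic estimate added for illustration: it assumes $${ \langle e_i^2\rangle = 1 }$$, $${ 0<\gamma<1 }$$, and replaces the sum over the lag $${ s=\vert i-j\vert }$$ by an integral for large t:

$$\langle Z^2(t)\rangle = \sum_{i,j=1}^t \langle e_ie_j\rangle = t + 2b\sum_{s=1}^{t-1}(t-s)\,s^{-\gamma} \approx t + 2b\int_0^t (t-s)\,s^{-\gamma}\,\mathrm{d}s = t + \frac{2b}{(1-\gamma)(2-\gamma)}\,t^{2-\gamma} \: .$$

For large t the second term dominates, which reproduces $${ \langle Z^2(t)\rangle\sim t^{2-\gamma} }$$.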

There exist several methods to generate correlated random surfaces. We shall describe the successive random additions method [41], which iteratively generates the self‐affine function Z(x) in the unit interval $${ 0\le x\le 1 }$$. An alternative method that is detailed in the chapter of Jan Kantelhardt is the Fourier‐filtering technique and its variants.

In the $${ n=0 }$$ iteration, we start at the edges $${ x=0 }$$ and $${ x=1 }$$ of the unit interval and choose the values of $${ Z(0) }$$ and $${ Z(1) }$$ from a distribution with zero mean and variance $${ \sigma_0^2=1 }$$ (see Fig. 21). In the $${ n=1 }$$ iteration, we choose the midpoint $${ x=1/2}$$ and determine $${ Z(1/2) }$$ by linear interpolation, i. e., $${ Z(1/2)=(Z(0)+Z(1))/2 }$$, and add to all so-far calculated Z values ($${ Z(0) }$$, $${ Z(1/2) }$$, and $${ Z(1) }$$) random displacements from the same distribution as before, but with standard deviation $${ \sigma_1=(1/2)^H }$$, i. e., variance $${ \sigma_1^2=(1/2)^{2H} }$$ (see Fig. 21). In the $${ n=2 }$$ iteration we again first choose the midpoints ($${ x=1/4 }$$ and $${ x=3/4 }$$), determine their Z values by linear interpolation, and add to all so-far calculated Z values random displacements from the same distribution as before, but with standard deviation $${ \sigma_2=(1/2)^{2H} }$$. In general, in the nth iteration, one first interpolates the Z values of the midpoints and then adds random displacements to all existing Z values, with standard deviation $${ \sigma_n=(1/2)^{nH} }$$. The procedure is repeated until the required resolution of the surface is obtained. Figure 22 shows the graphs of three random surfaces generated this way, with $${ H=0.2 }$$, $${ H=0.5 }$$, and $${ H=0.8 }$$.
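
The one‐dimensional procedure can be sketched as follows. This is a minimal illustration, not the implementation of [41]: it assumes Gaussian displacements and reads $${ \sigma_n=(1/2)^{nH} }$$ as the standard deviation of the displacements added in iteration n. The function name `successive_random_additions` is hypothetical.

```python
import numpy as np

def successive_random_additions(H, n_iter, seed=0):
    """Return x and Z(x) on 2**n_iter + 1 equally spaced points of the unit interval."""
    rng = np.random.default_rng(seed)
    # n = 0: the two end points, zero mean and unit variance (Gaussian is an assumption)
    z = rng.normal(0.0, 1.0, size=2)
    for n in range(1, n_iter + 1):
        # Insert the midpoints by linear interpolation between neighbours
        new = np.empty(2 * len(z) - 1)
        new[0::2] = z
        new[1::2] = 0.5 * (z[:-1] + z[1:])
        # Add displacements with standard deviation (1/2)**(n*H) to ALL points
        new += rng.normal(0.0, 0.5 ** (n * H), size=len(new))
        z = new
    x = np.linspace(0.0, 1.0, len(z))
    return x, z

x, z_rough = successive_random_additions(H=0.2, n_iter=10)
x, z_smooth = successive_random_additions(H=0.8, n_iter=10)
```

Smaller values of H leave larger displacements at the finest scales, so `z_rough` fluctuates much more strongly from point to point than `z_smooth`.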

The generalization of the successive random addition method to two dimensions is straightforward (see Fig. 21). We consider a function $${Z(x,y) }$$ on the unit square $${ 0\le x,y \le 1 }$$. In the $${ n=0 }$$ iteration, we start with the four corners $${ (x,y)=(0,0), (1,0),(1,1), (0,1) }$$ of the unit square and choose their Z values from a distribution with zero mean and variance $${\sigma_0^2=1 }$$. In the $${ n=1 }$$ iteration, we choose the midpoint at $${(x,y)=(1/2,1/2) }$$ and determine $${ Z(1/2,1/2) }$$ by linear interpolation, i. e., $${ Z(1/2,1/2) = (Z(0,0) + Z(0,1) + Z(1,1) + Z(1,0))/4 }$$. Then we add to all so-far calculated Z values ($${ Z(0,0) }$$, $${ Z(0,1) }$$, $${ Z(1,0) }$$, $${ Z(1,1) }$$, and $${ Z(1/2,1/2) }$$) random displacements from the same distribution as before, but with standard deviation $${ \sigma_1=(1/\sqrt 2)^H }$$. In the $${ n=2 }$$ iteration we again choose the midpoints between the five existing sites, $${ (0,1/2) }$$, $${ (1/2,0) }$$, $${ (1/2,1) }$$, and $${ (1,1/2) }$$, determine their Z values by linear interpolation, and add to all so-far calculated Z values random displacements from the same distribution as before, but with standard deviation $${ \sigma_2=(1/\sqrt 2)^{2H} }$$. This procedure is repeated again and again, until the required resolution of the surface is obtained.

At the end of this section we would like to note that self‐similar or self‐affine fractal structures with features similar to those fractal models discussed above can be found in nature on all, astronomic as well as microscopic, length scales. Examples include clusters of galaxies (the fractal dimension of the mass distribution is about 1.2 [42]), the crater landscape of the moon, the distribution of earthquakes (see Chap. 2 in [15]), and the structure of coastlines, rivers, mountains, and clouds. Fractal cracks (see, for example, Chap. 5 in [13]) occur on length scales ranging from $${ 10^3 }$$ km (like the San Andreas fault) to micrometers (like fractures in solid materials) [44].

Many naturally growing plants show fractal structures; examples range from trees and the roots of trees to cauliflower and broccoli. The patterns of blood vessels in the human body, the kidney, the lung, and some types of nerve cells have fractal features (see Chap. 3 in [15]). In materials sciences, fractals appear in polymers, gels, ionic glasses, aggregates, electro‐deposition, rough interfaces and surfaces (see [13] and Chaps. 4 and 6 in [15]), as well as in fine particle systems [43]. In all these structures there is no characteristic length scale in the system besides the physical upper and lower cut-offs.

The occurrence of self‐similar or self‐affine fractals is not limited to structures in real space, as we will discuss in the next section.

## Long-Term Correlated Records

Long-range dependencies as described in the previous section do not only occur in surfaces. Of great interest is long-term memory in climate, physiology, and financial markets; the examples range from river floods [45,46,47,48,49], temperatures [50,51,52,53,54], and wind fields [55] to market volatilities [56], heart-beat intervals [57,58], and internet traffic [59].

Consider a record $${ x_i }$$ of discrete numbers, where the index i runs from 1 to N. The $${ x_i }$$ may be daily or annual temperatures, daily or annual river flows, or any other set of data consisting of N successive data points. We are interested in the fluctuations of the data around their (sometimes seasonal) mean value. Without loss of generality, we assume that the mean of the data is zero and the variance equal to one. In analogy to the previous section, we call the data long-term correlated when the corresponding autocorrelation function $${ C_x(s) }$$ decays by a power law,
$$C_x(s) = \langle x_i x_{i+s}\rangle\equiv \frac{1}{N - s} \, \sum^{N-s}_{i=1} x_i x_{i+s} \sim s^{-\gamma} \: ,$$
(18)
where γ denotes the correlation exponent, $${ 0 < \gamma < 1 }$$. Such correlations are named ‘long-term’ since the mean correlation time $${ T = \int^\infty_0 C_x(s) \, \mathrm{d}s }$$ diverges in the limit of infinitely long series where $${ N \rightarrow \infty }$$. If the $${ x_i }$$ are uncorrelated, $${ C_x(s) = 0 }$$ for $${ s > 0 }$$. More generally, if correlations exist up to a certain correlation time $${ s_x }$$, then $${ C_x(s) > 0 }$$ for $${ s < s_x }$$ and $${ C_x(s)= 0 }$$ for $${ s > s_x }$$.
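
The estimator in Eq. (18) translates directly into code. The following is a minimal sketch; the function name `autocorrelation` is hypothetical, and the record is first normalized to zero mean and unit variance as assumed in the text.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """C_x(s) for s = 0..max_lag, after normalising x to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    x = (x - x.mean()) / x.std()
    n = len(x)
    # (1/(N-s)) * sum_{i=1}^{N-s} x_i x_{i+s}
    return np.array([np.dot(x[:n - s], x[s:]) / (n - s) for s in range(max_lag + 1)])

rng = np.random.default_rng(1)
c_white = autocorrelation(rng.normal(size=20000), max_lag=10)
```

For the uncorrelated test record, `c_white[0]` equals one and the remaining values scatter around zero with a spread of order $${ N^{-1/2} }$$. This statistical noise is one reason why a direct fit of the exponent γ to $${ C_x(s) }$$ is difficult at large s.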

Figure 23 shows parts of an uncorrelated (left) and a long-term correlated (right) record, with $${ \gamma = 0.4 }$$; both series have been generated by the computer. The red line is the moving average over 30 data points. For the uncorrelated data, the moving average is close to zero, while for the long-term correlated data set, the moving average can have large deviations from the mean, forming some kind of mountain valley structure. This structure is a consequence of the power-law persistence. The mountains and valleys in Fig. 23b look as if they had been generated by external trends, and one might be inclined to draw a trend-line and to extrapolate the line into the near future for some kind of prognosis. But since the data are trend-free, only a short-term prognosis utilizing the persistence can be made, and not a longer‐term prognosis, which often is the aim of such a regression analysis.

Alternatively, in analogy to what we described above for self‐affine surfaces, one can divide the data set into $${ K_s }$$ equidistant windows of length s and determine in each window ν the squared sum
$$F_\nu^2(s) = \left( \sum_{i=1}^s x_{(\nu-1)s+i} \right)^2$$
(19a)
and detect how the average of this quantity over all windows, $${ F^2(s)= 1/K_s \, \sum_{\nu=1}^{K_s} F_\nu^2(s) }$$, scales with the window size s. For long-term correlated data one can show that $${ F^2(s) }$$ scales as $${ \langle Z^2(t)\rangle }$$ in the previous section, i. e.
$$F^2 (s) \sim s^{2\alpha} \: ,$$
(19b)
where $${ \alpha = 1 - \gamma/2 }$$. This relation represents an alternative way to determine the correlation exponent γ.
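
A minimal sketch of this fluctuation analysis is given below. It assumes non-overlapping windows and discards any incomplete window at the end of the record; the function names are hypothetical.

```python
import numpy as np

def fluctuation_function(x, s):
    """F^2(s): average over the K_s windows of the squared sum of the data in each window."""
    x = np.asarray(x, dtype=float)
    k = len(x) // s                                  # number of complete windows K_s
    sums = x[:k * s].reshape(k, s).sum(axis=1)       # one sum per window
    return np.mean(sums ** 2)

def fluctuation_exponent(x, scales):
    """alpha from the log-log slope of F^2(s) ~ s^(2 alpha)."""
    f2 = [fluctuation_function(x, s) for s in scales]
    slope, _ = np.polyfit(np.log(scales), np.log(f2), 1)
    return slope / 2.0

rng = np.random.default_rng(2)
alpha_white = fluctuation_exponent(rng.normal(size=2 ** 16), [8, 16, 32, 64, 128])
```

For the uncorrelated test record the estimate should be close to 1/2; values of α between 1/2 and 1 would indicate long-term correlations with $${ \gamma = 2(1-\alpha) }$$.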

Since trends resemble long-term correlations and vice versa, it is generally difficult to distinguish between trends and long-term persistence. In recent years, several methods have been developed, mostly based on the hierarchical detrended fluctuation analysis (DFAn), where long-term correlations in the presence of smooth polynomial trends of order $${ n-1 }$$ can be detected [57,58,60] (see also Fractal and Multifractal Time Series). In DFAn, one considers the cumulated sum (“profile”) of the $${ x_i }$$ and divides the N data points of the profile into equidistant windows of fixed length s. Then one determines, in each window, the best fit of the profile by an nth-order polynomial and determines the variance around the fit. Finally, one averages these variances to obtain the mean variance $${ F_{(n)}^2(s) }$$ and the corresponding mean standard deviation (mean fluctuation) $${ F_{(n)}(s) }$$. One can show that for long-term correlated trend-free data $${ F_{(n)}(s) }$$ scales with the window size s as F(s) in Eq. (19b), i. e., $${ F_{(n)} (s) \sim s^\alpha }$$, with $${ \alpha = 1 - \gamma/2 }$$, irrespective of the order of the polynomial n. For short-term correlated records (including the case $${ \gamma \geq 1 }$$), the exponent is $${ 1/2 }$$ for s above $${ s_x }$$. It is easy to verify that trends of order $${ k - 1 }$$ in the original data are eliminated in $${ F_{(k)}(s) }$$ but contribute to $${ F_{(k-1)}, F_{(k-2)} }$$ etc., and this allows one to determine the correlation exponent γ in the presence of trends. For example, in the case of a linear trend, DFA0 and DFA1 (where $${ F_{(0)}(s) }$$ and $${ F_{(1)}(s) }$$ are determined) are affected by the trend and will exaggerate the asymptotic exponents α, while DFA2, DFA3 etc. (where $${ F_{(2)}(s) }$$, $${ F_{(3)}(s) }$$ etc. are determined) are not affected by the trend and will show, in a double logarithmic plot, the same value of α, which then immediately gives the correlation exponent γ. When γ is known this way, one can try to detect the trend, but there is no unique treatment available. In recent papers [61,62,63], different kinds of analysis have been elaborated and applied to estimate trends in the temperature records of the Northern Hemisphere and Siberian locations.
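
The DFAn steps described above can be sketched compactly with NumPy. This is a minimal illustration rather than the reference implementation of [57,58,60]: it uses non-overlapping windows counted from the start of the record only, and the function name `dfa` and the parameter `order` (the polynomial order n) are hypothetical.

```python
import numpy as np

def dfa(x, scales, order=2):
    """Return the fluctuation function F_(n)(s) for each window size s in scales."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())                   # cumulated sum ("profile")
    fluct = []
    for s in scales:
        k = len(profile) // s                           # number of complete windows
        windows = profile[:k * s].reshape(k, s).T       # shape (s, k), one column per window
        t = np.arange(s) / s                            # rescaled time inside a window
        coeffs = np.polyfit(t, windows, order)          # fit all windows at once
        fits = np.vander(t, order + 1) @ coeffs         # fitted polynomials, shape (s, k)
        variances = np.mean((windows - fits) ** 2, axis=0)
        fluct.append(np.sqrt(variances.mean()))
    return np.array(fluct)

rng = np.random.default_rng(3)
noise = rng.normal(size=2 ** 15)
scales = np.array([16, 32, 64, 128, 256])
f_plain = dfa(noise, scales, order=2)
f_trend = dfa(noise + 0.001 * np.arange(len(noise)), scales, order=2)
alpha_dfa2 = np.polyfit(np.log(scales), np.log(f_plain), 1)[0]
```

Because a linear trend in the data becomes a quadratic term in the profile, the second-order fit removes it, and `f_trend` should agree with `f_plain` up to rounding errors. With `order=1` the same trend would instead inflate the fluctuation function at large s.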

### Climate Records

Figure 24 shows representative results of the DFAn analysis, for temperature, precipitation and run-off data. For continental temperatures, the exponent α is around 0.65, while for island stations and sea surface temperatures the exponent is considerably higher. There is no crossover towards uncorrelated behavior at larger time scales. For the precipitation data, the exponent is close to 0.55, not being significantly larger than for uncorrelated records.

Figure 25 shows a summary of the exponent α for a large number of climate records. It is interesting to note that while the distribution of α‑values is quite broad for run-off, sea‐surface temperature, and precipitation records, the distribution is quite narrow, located around $${ \alpha = 0.65 }$$ for continental atmospheric temperature records. For the island records, the exponent is larger. The quite universal exponent $${ \alpha = 0.65 }$$ for continental stations can be used as an efficient test bed for climate models  [62,64,65].

The time window accessible by DFAn is typically $${ 1/4 }$$ of the length of the record. For instrumental records, the time window is thus restricted to about 50 years. For extending this limit, one has to take reconstructed records or model data, which range up to 2000 y. Both have, of course, large uncertainties, but it is remarkable that exactly the same kind of long-term correlations can be found in these data, thus extending the time scale where long-term memory exists to at least 500 y [61,62].

### Clustering of Extreme Events

Next we consider the consequences of long-term memory on the occurrence of rare events. Understanding (and predicting) the occurrence of extreme events is one of the major challenges in science (see, e. g., [68]). An important quantity here is the time interval between successive extreme events (see Fig. 26), and by understanding the statistics of these return intervals one aims to better understand the occurrence of extreme events.

Since extreme events are, by definition, very rare and the statistics of their return intervals poor, one usually studies also the return intervals between less extreme events, where the data are above some threshold q and where the statistics is better, and hopes to find some general “scaling” relations between the return intervals at low and high thresholds, which then allows one to extrapolate the results to very large, extreme thresholds (see Fig. 26).

For uncorrelated data, the return intervals are independent of each other and their probability density function (pdf) is a simple exponential, $${ P_q(r) = (1/R_q) \exp(-r/R_q) }$$. In this case, all relevant quantities can be derived from the knowledge of the mean return interval $${ R_q }$$. Since the return intervals are uncorrelated, a sequential ordering cannot occur. There are many cases, however, where some kind of ordering has been observed where the hazardous events cluster, for example in the floods in Central Europe during the Middle Ages or in the historic water levels of the Nile river, which are shown in Fig. 26 for 663 y. Even by eye one can see that the events are not distributed randomly but are arranged in clusters. A similar clustering was observed for extreme floods, winter storms, and avalanches in Central Europe (see, e. g., Figs. 4.4, 4.7, 4.10, and 4.13 in [69], Fig. 66 in [70], and Fig. 2 in [71]). The reason for this clustering is the long-term memory.
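
The return-interval statistics of uncorrelated data can be checked numerically with a few lines of Python. The sketch below is an illustration under stated assumptions: the record is Gaussian white noise, the threshold is $${ q=2 }$$, and the function name `return_intervals` is hypothetical.

```python
import numpy as np

def return_intervals(x, q):
    """Intervals (in time steps) between successive events with x > q."""
    times = np.flatnonzero(np.asarray(x) > q)
    return np.diff(times)

rng = np.random.default_rng(4)
x = rng.normal(size=200000)
r = return_intervals(x, q=2.0)
mean_interval = r.mean()                 # estimate of R_q
exceedance = np.mean(x > 2.0)            # fraction of data above the threshold
spread_ratio = r.std() / r.mean()        # close to 1 for an exponential pdf
```

The mean return interval is, up to edge effects, the inverse of the exceedance probability, and for an exponential pdf the standard deviation of the intervals is approximately equal to their mean, so `spread_ratio` should be close to one for this record.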

Figure 27 shows $${ P_q(r) }$$ for long-term correlated records with $${ \gamma = 0.4 }$$ (corresponding to $${ \alpha = 1-\gamma/2 = 0.8 }$$), for three values of the mean return interval $${ R_q }$$ (which is easily obtained from the threshold q and independent of the correlations). The pdf is plotted in a scaled way, i. e., $${ R_q P_q(r) }$$ as a function of $${ r/R_q }$$. The figure shows that all three curves collapse. Accordingly, when we know the functional form of the pdf for one value of $${ R_q }$$, we can easily deduce its functional form also for very large $${ R_q }$$ values, which, due to their poor statistics, cannot be obtained directly from the data. This scaling is a very important property, since it allows one to make predictions also for rare events which otherwise are not accessible with meaningful statistics. When the data are shuffled, the long-term correlations are destroyed and the pdf becomes a simple exponential.

The functional form of the pdf is a quite natural extension of the uncorrelated case. The figure suggests that
$$\ln P_q(r) \sim - (r/R_q)^\gamma$$
(20)
i. e. simple stretched exponential behavior [72,73]. For γ approaching 1, the long-term correlations tend to vanish and we obtain the simple exponential behavior characteristic for uncorrelated processes. For r well below $${ R_q }$$, however, there are deviations from the pure stretched exponential behavior. Closer inspection of the data shows that for $${ r/R_q\ll 1 }$$ the decay of the pdf is characterized by a power law, with the exponent $${ \gamma - 1 }$$. This overall behavior does not depend crucially on the way the original data are distributed. In the cases shown here, the data had a Gaussian distribution, but similar results have been obtained also for exponential, power-law and log‐normal distributions [74]. Indeed, the characteristic stretched exponential behavior of the pdf can also be seen in long historic and reconstructed records [73].

The form of the pdf indicates that return intervals both well below and well above their average value are considerably more frequent for long-term correlated data than for uncorrelated data. The distribution does not quantify, however, if the return intervals themselves are arranged in a correlated or in an uncorrelated fashion, and if clustering of rare events may be induced by long-term correlations.

To study this question, [73] and [74] have evaluated the autocorrelation function of the return intervals in synthetic long-term correlated records. They found that also the return intervals are arranged in a long-term correlated fashion, with the same exponent as the original data. Accordingly, a large return interval is more likely to be followed by a large one than by a short one, and a small return interval is more likely to be followed by a small one than by a large one, and this leads to clustering of events above some threshold q, including extreme events.

As a consequence of the long-term memory, the probability of finding a certain return interval depends on the preceding interval. This effect can be easily seen in synthetic data sets generated numerically, but not so well in climate records where the statistics is comparatively poor. To improve the statistics, we now only distinguish between two kinds of return intervals, “small” ones (below the median) and “large” ones (above the median), and determine the mean $${ R^+_q }$$ and $${ R^-_q }$$ of those return intervals following a large $${ (+) }$$ or a small $${ (-) }$$ return interval. Due to scaling, $${ R^+_q/R_q }$$ and $${ R^-_q/R_q }$$ are independent of q. Figure 28 shows both quantities (calculated numerically for long-term correlated Gaussian data) as a function of the correlation exponent γ. The lower dashed line is $${ R^-_q/R_q }$$, the upper dashed line is $${ R^+_q/R_q }$$. In the limit of vanishing long-term memory, for $${ \gamma = 1 }$$, both quantities coincide, as expected. Figure 28 also shows $${ R^+_q/R_q }$$ and $${ R^-_q/R_q }$$ for five climate records with different values of γ. One can see that the data agree very well, within the error bars, with the theoretical curves.
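
The two conditional means can be computed from a sequence of return intervals as sketched below. The function name `conditional_mean_intervals` is hypothetical, and the treatment of intervals exactly equal to the median (they are not counted as either small or large predecessors) is a choice made here, not prescribed by the text.

```python
import numpy as np

def conditional_mean_intervals(r):
    """Return (R+/R, R-/R): mean interval following a large or a small interval."""
    r = np.asarray(r, dtype=float)
    median = np.median(r)
    previous, following = r[:-1], r[1:]
    r_plus = following[previous > median].mean()    # after a "large" interval
    r_minus = following[previous < median].mean()   # after a "small" interval
    return r_plus / r.mean(), r_minus / r.mean()

rng = np.random.default_rng(5)
ratio_plus, ratio_minus = conditional_mean_intervals(rng.exponential(size=50000))
```

For the independent exponential intervals used in the usage line, both ratios should be close to one, corresponding to the point where the two curves meet in the absence of memory. For a record with clustering, `ratio_plus` lies above one and `ratio_minus` below one.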

## Long-Term Correlations in Financial Markets and Seismic Activity

The characteristic behavior of the return intervals, i. e. long-term correlations and stretched exponential decay, can also be observed in financial markets and seismic activity. It is well known (see, e. g. [56]) that the volatility of stocks and exchange rates is long-term correlated. Figure 29 shows that, as expected from the foregoing, also the return intervals between daily volatilities are long-term correlated, with roughly the same exponent γ as the original data [75]. It has further been shown [75] that also the pdfs show the characteristic behavior predicted above.

A further example where long-term correlations seem to play an important role is provided by earthquakes in certain bounded areas (e. g. California) in time regimes where the seismic activity is (quasi) stationary. It has been discovered recently by [76] that the magnitudes of earthquakes in Northern and Southern California, from 1995 until 1998, are long-term correlated with an exponent around $${ \gamma = 0.4 }$$, and that also the return intervals between the earthquakes are long-term correlated with the same exponent. For the given exponential distribution of the earthquake magnitudes (following the Gutenberg–Richter law), the long-term correlations lead to a characteristic dependence on the scaled variable $${ r/R_q }$$ which can explain, without any fit parameter, the previous results on the pdf of the return intervals by [77].

## Multifractal Records

Many records do not exhibit a simple monofractal scaling behavior, which can be accounted for by a single scaling exponent. In some cases, there exist crossover (time‑) scales $${ s_x }$$ separating regimes with different scaling exponents, e. g. long-term correlations on small scales below $${ s_x }$$ and another type of correlations or uncorrelated behavior on larger scales above $${ s_x }$$. In other cases, the scaling behavior is more complicated, and different scaling exponents are required for different parts of the series. In even more complicated cases, such different scaling behavior can be observed for many interwoven fractal subsets of the time series. In this case a multitude of scaling exponents is required for a full description of the scaling behavior, and a multifractal analysis must be applied (see, e. g., [78,79] and literature therein).

To see this, it is meaningful to extend Eqs. (19a) and (19b) by considering the more general average
$${F^q(s)} = \frac{1}{K_s} \, \sum_{\nu=1}^{K_s} \left[ F_\nu^2(s) \right]^{q/2}$$
(21)
with q between $${ -\infty }$$ and $${ +\infty }$$. For $${ q\ll-1 }$$ the small fluctuations will dominate the sum, while for $${ q\gg1 }$$ the large fluctuations are dominant. It is reasonable to assume that the q‑dependent average scales with s as
$${F^q(s)}\sim s^{q\beta(q)} \: ,$$
(22)
with $${ \beta(2)=\alpha }$$. Equation (22) generalizes Eq. (19b). If $${ \beta(q) }$$ is independent of q, we have $${ ({F^q(s)})^{1/q}\sim s^{\alpha} }$$, independent of q, and both large and small fluctuations scale the same. In this case, a single exponent is sufficient to characterize the record, which then is referred to as monofractal. If $${ \beta(q) }$$ is not identical to α, we have a multifractal [1,4,12]. In this case, the dependence of $${ \beta(q) }$$ on q characterizes the record. Instead of $${ \beta(q) }$$ one frequently considers the spectrum $${ f(\omega) }$$ that one obtains by Legendre transform of $${ q\beta(q) }$$, $${ \omega = \mathrm{d}(q \beta(q)) / \mathrm{d}q }$$, $${ f(\omega)=q\omega-q\beta(q) + 1 }$$. In the monofractal limit we have $${ f(\omega)=1 }$$.
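
The q-dependent average of Eq. (21) and the exponent of Eq. (22) can be estimated as in the following minimal sketch. It assumes non-overlapping windows and nonzero q; the function names are hypothetical. Since $${ [F_\nu^2(s)]^{q/2} }$$ is the absolute value of the window sum raised to the power q, no explicit squaring is needed.

```python
import numpy as np

def generalized_fluctuation(x, s, q):
    """F^q(s): average over windows of |window sum|**q."""
    x = np.asarray(x, dtype=float)
    k = len(x) // s
    sums = x[:k * s].reshape(k, s).sum(axis=1)
    return np.mean(np.abs(sums) ** q)

def beta(x, scales, q):
    """beta(q) from the log-log slope of F^q(s) ~ s^(q beta(q)); q must be nonzero."""
    fq = [generalized_fluctuation(x, s, q) for s in scales]
    slope, _ = np.polyfit(np.log(scales), np.log(fq), 1)
    return slope / q

rng = np.random.default_rng(6)
white = rng.normal(size=2 ** 17)
betas = [beta(white, [8, 16, 32, 64, 128], q) for q in (1, 2, 4)]
```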
For generating multifractal data sets, one considers mostly multiplicative random cascade processes, described, e. g., in [3,4]. In this process, the data set is obtained in an iterative way, where the length of the record doubles in each iteration. It is possible to generate random cascades with vanishing autocorrelation function ($${ C_x(s)=0 }$$ for $${ s\ge1 }$$) or with algebraically decaying autocorrelation functions ($${ C_x(s)\sim s^{-\gamma} }$$). Here we focus on a multiplicative random cascade with vanishing autocorrelation function, which is particularly interesting since it can be used as a model for the arithmetic returns $${ (P_i - P_{i-1})/P_i }$$ of daily stock closing prices $${ P_i }$$ [80]. In the zeroth iteration $${ n = 0 }$$, the data set $${ (x_i) }$$ consists of one value, $${ x_1^{(n=0)} = 1 }$$. In the nth iteration, the data $${x_i^{(n)} }$$ consist of $${ 2^{n} }$$ values that are obtained from
$$x_{2l - 1}^{(n)} = x_l^{(n-1)} m_{2l - 1}^{(n)}$$
and
$$x_{2l}^{(n)} = x_l^{(n-1)} m_{2l}^{(n)} \: ,$$
(23)
where the multipliers m are independent and identically distributed (i.i.d.) random numbers with zero mean and unit variance. The resulting pdf is symmetric with log‐normal tails, with vanishing correlation function $${ C_x(s) }$$ for $${ s\ge1 }$$.
It has been shown that in this case, the pdf of the return intervals decays by a power law
$$P_q(r) \sim \left( \frac{r}{R_q} \right)^{-\delta(q)} \: ,$$
(24)
where the exponent δ depends explicitly on $${ R_q }$$ and seems to converge to a limiting curve for large data sets. Despite the vanishing autocorrelation function of the original record, the autocorrelation function of the return intervals decays by a power law with a threshold‐dependent exponent [80]. Obviously, these long-term correlations have been induced by the nonlinear correlations in the multifractal data set. Extracting the return interval sequence from a data set is a nonlinear operation, and thus the return intervals are influenced by the nonlinear correlations in the original data set. Accordingly, the return intervals in data sets without linear correlations are sensitive indicators for nonlinear correlations in the data records. The power-law dependence of $${ P_q(r) }$$ can be used for an improved risk estimation. Both power-law dependencies can be observed in economic and physiological records that are known to be multifractal [81].
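
The cascade of Eq. (23) is straightforward to generate numerically. The sketch below assumes Gaussian multipliers with zero mean and unit variance (the text specifies only the mean and the variance), and the function name `random_cascade` is hypothetical.

```python
import numpy as np

def random_cascade(n_iter, seed=0):
    """Multiplicative random cascade; returns the 2**n_iter values of the last iteration."""
    rng = np.random.default_rng(seed)
    x = np.ones(1)                                   # zeroth iteration: a single value 1
    for _ in range(n_iter):
        # Every x_l is copied to positions 2l-1 and 2l, then multiplied by i.i.d. numbers
        m = rng.normal(0.0, 1.0, size=2 * len(x))    # Gaussian multipliers (assumption)
        x = np.repeat(x, 2) * m
    return x

x = random_cascade(14)
sign_corr = np.mean(np.sign(x[:-1]) * np.sign(x[1:]))
log_abs = np.log(np.abs(x))
magnitude_corr = np.corrcoef(log_abs[:-1], log_abs[1:])[0, 1]
```

Neighboring values have independent signs, so `sign_corr` is close to zero, while their magnitudes share most of their multipliers and `magnitude_corr` is large. This is a simple way to see that the record has no linear correlations but strong nonlinear ones.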

## Acknowledgments

We would like to thank all our coworkers in this field, in particular Eva Koscielny‐Bunde, Mikhail Bogachev, Jan Kantelhardt, Jan Eichner, Diego Rybski, Sabine Lennartz, Lev Muchnik, Kazuko Yamasaki, John Schellnhuber and Hans von Storch.