25 Years of Self-Organized Criticality: Concepts and Controversies

Introduced by the late Per Bak and his colleagues, self-organized criticality (SOC) has been one of the most stimulating concepts to come out of statistical mechanics and condensed matter theory in the last few decades, and has played a significant role in the development of complexity science. SOC, and more generally fractals and power laws, have attracted much comment, ranging from the very positive to the polemical. The other papers in this special issue (Aschwanden et al., 2014; McAteer et al., 2014; Sharma et al., 2015) showcase the considerable body of observations in solar, magnetospheric and fusion plasma inspired by the SOC idea, and expose the fertile role the new paradigm has played in approaches to modeling and understanding multiscale plasma instabilities. This very broad impact, and the necessary process of adapting a scientific hypothesis to the conditions of a given physical system, has meant that SOC as studied in these fields has sometimes differed significantly from the definition originally given by its creators. In Bak's own field of theoretical physics there are significant observational and theoretical open questions, even 25 years on (Pruessner, 2012). One aim of the present review is to address the dichotomy between the great reception SOC has received in some areas, and its shortcomings, as they became manifest in the controversies it triggered. Our article tries to clear up what we think are misunderstandings of SOC in fields more remote from its origins in statistical mechanics, condensed matter and dynamical systems by revisiting Bak, Tang and Wiesenfeld's original papers.

In particular, SOC has become a research field in laboratory fusion plasmas, solar physics and magnetospheric physics, reviewed in the complementary papers (McAteer et al., 2014; Sharma et al., 2015) in this volume. Like them, our own paper results from two workshops at the International Space Science Institute in 2012 and 2013 (http://www.issibern.ch/teams/s-o-turbulence/).
Despite its success, however, SOC has often divided opinions, even among experts. It has attracted significant criticism (e.g. Perković et al., 1995; Krommes, 2000; Frigg, 2003; Stumpf and Porter, 2012), some of it deserved, some of it polemical, to such a degree that some authors in condensed matter physics will avoid mentioning it altogether (e.g. Alvarado et al., 2013). Although numerous reviews (Turcotte, 1999; Alstrøm et al., 2004; Sornette, 2006; Dhar, 2006) and several book-length surveys of theory (Jensen, 1998; Pruessner, 2012; Christensen and Moloney, 2005) and applications (Hergarten, 2002; Aschwanden, 2011; Rodríguez-Iturbe and Rinaldo, 2001) exist, the enduring "hectic air of controversy" (Jensen, 1998, p. 125) has ensured that many people remain uncertain both of SOC's long-term status, and of its net contribution to science. This will undoubtedly also be true for many readers of Space Science Reviews, browsing the group of papers centered on SOC in space and plasma physics that are collected in this volume.
Our contribution to the present volume aims to both complement the surveys of SOC in space and lab plasmas in the accompanying papers and to address this controversy. We first, in Section 2, discuss why the multiscale avalanching paradigm, of which SOC is the best known example, is relevant to both space and laboratory plasmas. We then, in Section 3, clarify just what kind of "SOC" we are talking about in this paper, by distinguishing as briefly as we can between the main different perceptions of SOC (see Figure 1) that one can find in various research disciplines, including space plasma physics. The first of these pictures is BTW SOC, the SOC that was introduced in BTW's original papers. It has a theoretical core underpinning it which not only remains essentially intact but also has been substantially clarified over 25 years, and so it is the SOC which we discuss in the rest of the paper.
We thus continue by re-examining BTW SOC's foundations. We do so by reference to, and quotes from, the key original papers, in search of BTW's original claims, in order to understand the interpretations (and misinterpretations) which have been made of them. We first revisit BTW's motivation for introducing the SOC idea, which they stated most clearly in Bak and Chen (1989). In Section 4, Section 5, and Section 6, closely following this key paper, we will recap the reasoning which led BTW to their postulate, showing how it does in fact give a relatively precise definition of SOC.
Section 7 then brings us up to date by linking the preceding discussion to the contemporary literature on SOC. In so doing we identify the necessary and sufficient conditions for SOC. Not differentiating necessary and sufficient conditions is, we believe, one source of the erroneous beliefs (sometimes found in the literature) that everything that is avalanching must be critical and self-organized, or, conversely, that everything that displays long-ranged correlations or a power law must be an instance of (self-organized) criticality.
These errors in logic, as well as a loose interpretation of BTW's core idea, have helped to create the divergence of versions of SOC noted above and some of the controversy that has tended to surround SOC. In Section 8 we address some of the most prominent of these controversies, giving our own views about some misconceptions and subsequent conflicts that have arisen. This section is an opinion piece insofar as we will try to point out and clarify certain issues which we think have caused unnecessary problems in the past.
We then balance our discussion of controversy by noting that in several areas of science SOC has been a success story. Section 9 thus discusses how this has happened, in a context most relevant to our paper. We briefly discuss how SOC has provided a paradigm for, and thereby consolidation of, existing observations which lacked context, in solar, magnetospheric and fusion plasma physics. Finally, we offer some concluding remarks and perspectives on future research in Section 10.
Appendix A discusses, in a fairly self-contained manner, the more technical topic of scaling, intended to complement our discussion of SOC by setting the BTW worldview, SOC and its foundation in some of its broader theoretical context. It may be omitted in a first reading. In it we discuss the general ideas of scaling, which underpin a whole range of disciplines such as critical phenomena and dynamical systems, as well as methods, like dimensional analysis (e.g. Buckingham, 1914).
Our team of authors has a somewhat unique perspective in that our previous involvement with SOC and complexity ranges from condensed matter theory (Jensen, 1998; Pruessner, 2012), where the SOC concept originated, to solar astrophysics (Crosby et al., 1993; Watkins et al., 2009), complex plasma physics both in the coupled magnetosphere and turbulent solar wind (Chapman et al., 1998; Watkins et al., 1999; Lui et al., 2000; Chapman and Watkins, 2001; Watkins et al., 2001; Freeman and Watkins, 2002; Watkins, 2002) and in fusion reactors (Dendy et al., 2007), and SOC-inspired (or informed) cross-disciplinary complexity research in the environmental sciences (Watkins and Freeman, 2008; Watkins, 2013; Graves et al., 2014). We have thus tried to cater both for readers interested in the theoretical foundations of SOC and those concerned with its applications to nature. We hope we have clarified that SOC remains a strong, relevant, scientific theory, even if it is not always "how Nature works". Some readers will, rightly, read parts of this contribution as an opinion piece; however, we have tried to support our views by quotes and references wherever possible. Readers with a background in statistical mechanics may be interested in the historical context that led to the development of SOC, in particular Secs. 4-7, but also Section 8 for the controversies surrounding SOC. Those from the plasma physics community will probably be interested in Secs. 2, 5, 7, 8 and 9 with a focus on applications in plasma physics. Those who are mostly interested in the controversies surrounding SOC will benefit particularly from reading Secs. 3-6 and 8. We obviously believe that the other sections, in particular the conclusions in Section 10, are of broad interest and certainly worth reading, but most of the sections are fairly self-contained, inviting the reader to make their own selection.
We have tried to facilitate this by prefacing the most technical sections by bullet point summaries of what they contain.
solving for the dynamics of individual events is intractable. Instead, one can look to the success of the renormalization group approach (Wilson, 1971, 1979) in critical phenomena. Here, one needs to characterize the fundamental local interaction, and how it coarse-grains as more and more elements in the system are aggregated. Central to the structure of such a model is self-similar scaling (the system looks the same on all scales subject to a rescaling), leading to power law distributions of (event) sizes and power law (long-range) correlations as the key observables. Importantly, a broad range of different detailed, microscopic interactions, on coarse-graining, lead to the same collective behavior; thus one expects the same essential phenomenology to be ubiquitous.
The classic triumphs of the renormalization group (RG) in critical phenomena (Wilson, 1971, 1979) have been for systems in equilibrium. Extensions of the RG approach to non-equilibrium, either relaxing to equilibrium or staying far from it, have been developed successfully over the last 40 years or so (e.g. Chang et al., 1992), but a successful application to plasma physics remains elusive (though see for example Balescu, 1997). It is for these diverse plasma systems that SOC offers considerable attractions as a paradigm. As we will see in Section 6, SOC introduces dynamics by enforcing a separation of time scales, i.e. the build-up to instability is slow, while relaxation is fast. This fast relaxation leads to avalanche-like, bursty energy release on a broad range of scales. The dynamics of an avalanche is fundamentally multiscale: it occurs by coupling across many spatial scales in the system. Importantly, the statistics of energy release events, and indeed the dynamics, are not sensitive to the details of the instability; thus in a plasma where many instabilities and routes to instability are possible, one expects to see the same, robust emergent behavior. Indeed, one could identify a paradigm for SOC in plasmas, or perhaps more accurately, "multiscale avalanching", based on these properties alone, which are sufficient to provide a new, insightful framework for ordering the observations.

Perceptions and receptions of SOC
The interaction of BTW's papers and their many readers has led to nested perceptions of SOC, as illustrated in Figure 1. We can summarize these as essentially four, in order of increasing ubiquitousness:
• Self-tuned phase transitions can (and do) exist in nature: The core idea of SOC, clearly enunciated by Bak and Chen (1989), which was presented as a dynamical origin of spatio-temporal fractals in nature.

Fig. 1
A schematic representation of the range of perceptions of SOC in the literature, from the most minimal at the center to the most visionary. As explained in the text, the hatched core is the proposal that a mechanism exists in nature whereby some systems tune themselves to a phase transition. This mechanism has sometimes been promoted as the primary (or even single) cause of fractals in nature (second circle). Some authors have regarded fractals and power laws as synonymous (third circle), and proposed that SOC was needed as the underlying mechanism for the latter as well, despite the many alternative explanations in many cases. The outer-most region considers contingency in nature as the signature of SOC.
• All fractals in nature are caused by SOC: A much more sweeping claim, but one which a reader could have been forgiven for inferring from the abstract of the same paper (Bak and Chen, 1989).
• All power laws are caused by SOC: An even more sweeping claim, never to our knowledge made by Bak, but which many readers might easily have inferred from reading the discussion of Zipf's law in the first chapter of his book (Bak, 1996).
• The contingency of nature is caused by SOC: See for example the abstract of Bak and Paczuski (1995), and Bak's (2000) review of Buchanan (2000).
The clear divergence between these pictures of SOC, and the fact that all of them have had at least some adherents, has had some important consequences. An ever increasing diversity and confusion about claims, proofs and evidence resulted in a muddled perception of the status of SOC; of how it explains natural phenomena; and which ones. To this day, except for computational confirmation of the core claim, there is no unambiguous, unquestioned evidence for any of the claims above, even though they inspired much research. However, it is also fair to say that experimental, observational, numerical and analytical work is homing in to corroborate at least the core claim.
[. . . ] but its theoretical interest is more far-reaching, because it "also serves as a paradigmatic model for a wide class of physical problems where interplay of nonlinearity and disorder is important" (Pikovsky and Shepelyansky, 2008).
Whether correct or not in its generality, the first picture, the core SOC idea, was relatively tightly defined from the start, being formulated in the language of mathematical physics and condensed matter theory. It was actively pursued and debated by these communities from the outset, and has been the subject of several book-length treatments including those of two of the present authors (Jensen, 1998; Pruessner, 2012).
The second and third pictures have long been known to be wrong, and yet have at the same time been widely influential. Being at the same time a target for criticism and polemic, and a source of creative misunderstanding, they provide an interesting present-day example for historians and philosophers of science of how error and miscommunication can sometimes have positive as well as negative side effects.
The fourth, and most visionary, picture essentially expressed a new paradigm for complexity science, which we may call "complexity and contingency from criticality" as opposed to contingency from low dimensional chaos. Bak put this vision clearly, late in his life, in his review of Buchanan (2000):
The tool of history, certainly, is story-telling after the fact. Why is this? Why does it make no useful theoretical predictions? Why is it, in other words, that "Life is understood backwards, but must be lived forwards", as philosopher Søren Kierkegaard put it. [. . . ] Buchanan wants us to know that we live in a special time in which new ideas are beginning to make it possible to see why history is the way it is. Surprisingly, perhaps, the ideas it uses find their origin not in history, but in theoretical physics. He proposes to explain why history is and even must be punctuated by dramatic, unpredictable upheavals. He promotes a theory declaring that all past efforts to perceive cycles, progressions and understandable patterns of change in history have necessarily been doomed to failure. [. . . ] "Contingency is the affirmation of control by immediate events over destiny, the kingdom lost for want of a horseshoe nail," as biologist Stephen J. Gould has observed. And contingency is the hallmark of the critical state. (Bak, 2000, pp. 56-57)
This fourth interpretation of SOC is arguably at least as interesting as the first, though much more speculative. It is, however, of much less importance to the astrophysical plasma context of the present paper. We will thus concentrate entirely on the first, and will only discuss the second and third erroneous perceptions when we need to explain how they have confused the issue.

BTW's Stated Motivation: "a dynamical theory of the physics of fractals"
In this section we describe BTW's aims in proposing SOC, and how Bak and Chen in their 1989 Mandelbrot Festschrift paper (Bak and Chen, 1989):
• Quoted the widespread evidence for spatial fractals, and power law spatial correlation functions, and echoed Kadanoff's call for a physical explanation,
• Remarked on the parallel unsolved time domain problem of "1/f" noise,
• Proposed the bold idea that natural complex systems could self-organize to a particular kind of state that produced these effects, analogous with those seen in laboratory "critical" systems near a phase transition, hence Self-Organized Criticality.
Spatial Fractals: In their 1989 paper, Bak and Chen gave what we believe to be their clearest statement of the original SOC idea. They first noted the evidence for the widespread existence of spatial fractals:
The importance of Mandelbrot's discovery that fractals occur widespread in nature can hardly be exaggerated. Many things which we used to think of as messy and structureless are in fact characterized by well-defined power law spatial correlation functions. By now, we are so used to seeing fractals that we are tempted to feel that we understand them. But do we simply have to accept their existence as "God-given" without further explanation or is it possible to construct a dynamical theory of the physics of fractals? (Bak and Chen, 1989, p. 5)
It is important to note that the power laws of concern to Bak and Chen were in the correlations between fluctuations in space, rather than the general question of power law size distributions in nature, a point we will return to (e.g. Section 7). BTW used power law distribution functions as a proxy for power law correlations, making that link explicit at an early stage (Bak et al., 1988a, p. 369). In general, it is by no means clear that size distributions with no clear connection to spatial correlation (or avalanches), such as those of fractured frozen potatoes (Oddershede et al., 1993), the distribution of lunar crater sizes (Head et al., 2010), or the length of queues in Britain's National Health Service (and her pubs) (Smethurst and Williams, 2001; Freckleton and Sutherland, 2001) would (or should) ever have been seriously intended to be in the remit of BTW's SOC. We would argue that it remains unhelpful to try to define a notion of "SOC" that is sufficiently elastic to encompass them.
Fractals in time and 1/f noise: Bak and Chen went on to highlight the ubiquity of fractals in time:
There is another ubiquitous phenomenon which has defied explanation for decades. The signal (water, electrical current, light, prices, . . . ) from a variety of sources has a power spectrum decaying with an exponent near unity at low frequencies . . . This type of behavior is known as "1/f" noise, or flicker noise. (Bak and Chen, 1989, p. 5)
The "1/f" noise which BTW referred to was discovered by Schottky, early in the 20th century. A 1/f power spectrum is generally regarded as the fingerprint of (temporal) correlations so strong that any future state must be considered a function of the system's entire history (Jensen, 1998, p. 9). This remains true for generalized 1/f spectra, i.e. across a range of power law dependences of the power spectrum on f. It is important to realize that rather than "defying" explanation, it had in fact been the subject of many explanations (e.g. van der Ziel, 1950; Schick and Verveen, 1974; Weissman, 1988), but that BTW found these unsatisfying and lacking in generality. In magnetospheric physics the presence of "1/f" spectra in geomagnetic indices and other ground-based magnetic measurements (Tsurutani et al., 1990; Weatherwax et al., 2000) was, early on, one of several key supporting pieces of evidence of SOC.
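To make the notion of a generalized 1/f spectrum concrete, the following minimal sketch (ours, not taken from any of the papers cited) estimates the exponent β in a power spectrum of the form S(f) ∝ f^(-β) by a least-squares fit of log S against log f at low frequencies. The function names `power_spectrum`, `spectral_exponent` and `synthetic_signal` are hypothetical, and the test signal is an artificial sum of cosines whose amplitudes are chosen so that β = 1 by construction. Real data would additionally need windowing, averaging and error estimates.

```python
import cmath
import math


def power_spectrum(x):
    """Periodogram |X_k|^2 / N at frequencies f = k/N, k = 1 .. N//2 - 1.

    Uses a direct O(N^2) discrete Fourier transform, which is adequate
    for the short illustrative series used here.
    """
    n = len(x)
    spec = []
    for k in range(1, n // 2):
        xk = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append((k / n, abs(xk) ** 2 / n))
    return spec


def spectral_exponent(spec, kmax):
    """Return beta from a straight-line fit of log S versus log f
    over the lowest kmax frequencies (S ~ f^-beta)."""
    pts = [(math.log(f), math.log(s)) for f, s in spec[:kmax]]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    sxy = sum((px - mx) * (py - my) for px, py in pts)
    sxx = sum((px - mx) ** 2 for px, _ in pts)
    return -sxy / sxx


def synthetic_signal(n, beta, kmax):
    """Sum of kmax cosines with amplitude k^(-beta/2), so that the power
    at harmonic k scales as k^-beta (phases 0.3*k are arbitrary)."""
    return [
        sum(k ** (-beta / 2) * math.cos(2 * math.pi * k * t / n + 0.3 * k)
            for k in range(1, kmax + 1))
        for t in range(n)
    ]


signal = synthetic_signal(n=128, beta=1.0, kmax=16)
beta_hat = spectral_exponent(power_spectrum(signal), kmax=16)
print(beta_hat)  # close to 1.0 by construction
```

The fit is restricted to the low-frequency end because, as in the quoted passage, the 1/f behavior is a statement about the spectrum "at low frequencies".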
The SOC postulate: A perceived need to unify the above two aspects of fractality, and, importantly, a claimed absence of existing ways to do so, led BTW to postulate the idea of SOC. Apparently, they were guided by the observation of scaling in space and time (fractals and generalized 1/f noise) in equilibrium and nonequilibrium critical phenomena, such as the Ising Model (Stanley, 1971; Hohenberg and Halperin, 1977). Bak and Chen put it this way:
Strangely enough, just as those working on fractal phenomena in nature never seem to be interested in the temporal aspects of the phenomenon, [. . . ] those working on "1/f" noise never bother with the spatial structure of the source of the signal. We believe that those two phenomena are often two sides of the same coin: they are the spatial and temporal manifestations of a self-organized critical state. (Bak and Chen, 1989, p. 5)
Bak and Chen prefaced this (already very bold) claim in the paper with one of the most memorably terse abstracts in the history of science, which may be called the "SOC postulate": "Fractals in nature originate from self-organized critical dynamical processes", expanded on by a comment on the first page where they said:
We see fractals as snapshots of systems operating at the self-organized critical state. (Bak and Chen, 1989, p. 5)
The gap between the relatively specific idea of explaining space-time fractal avalanching phenomena and therefore spatio-temporal correlations, and the aspiration that many perceived to explain any fractal in space or time, or even any power law distribution, has been a perennial problem, and a key source of the controversy and misunderstandings that still surround SOC.
• They argued that spatial and temporal scaling must usually be unavoidably connected
• They posited that in contrast to phase transitions (or chaos) seen at a fixed point in control parameter space there must be a more robust (and thus widespread) new kind of spatiotemporal critical behaviour which resulted from self-organization and for which their sandpile was the exemplar (i.e. SOC)
• They identified conditions for SOC behavior to be seen in a system, later recast by Jensen as "slowly driven interaction dominated and thresholded", and also highlighted the role of dissipation in maintaining such a state
• And they asserted that spacetime fractals were snapshots of the SOC state.
Rather than the impossibly broad, and with hindsight unnecessary, goal of explaining all power laws in nature with one mechanism, a close rereading of Bak and Chen (1989), as well as of Bak et al. (1987, 1988a), in which the SOC concept was launched, shows quite clearly that the aim of SOC was to unify dynamically evolving spatial and temporal fractals. BTW were taking as a cue Kadanoff's (1986) famous question "Fractals: Where's the Physics?", which itself had been aimed at a fractal "industry" which was experiencing its first wave of enthusiasm at that point. In a volume of papers dedicated to Mandelbrot, Bak and Chen (1989) responded equally boldly and provocatively to Kadanoff that: "Fractals in nature originate from self-organized critical dynamical processes". Beyond the immediate goal lay the even more ambitious one of accepting the challenges posed by two Nobel Laureates: Phil Anderson's (1972) celebrated essay "More is different" on complexity science, and Ken Wilson's (1979) invitation to a wider adoption of what is arguably the most powerful tool in statistical mechanics, the renormalization group.
Firstly, they argued that spatial and temporal scaling were intrinsically linked, i.e. that the scaling historically observed in time series as 1/f noise (van der Ziel, 1950) is related to the spatial scaling that became prominent with the advent of Mandelbrot's (1983) fractals (e.g. Feder, 1988, p. 7):
Actually, for those (like us) who are brought up as condensed matter physicists it is hard to believe that long-range spatial and temporal correlations can exist independently. A local signal cannot be "robust" and remain coherent over long times in the presence of any amount of noise, unless stabilized by the interactions with its environment. And a large, coherent spatial structure cannot disappear (or be created) instantly. For an illustration, think of the temporal distribution of sunshine, which must be correlated with the spatial distribution of clouds, through the dynamics of meteorology. (Bak and Chen, 1989, p. 5)
It has been argued subsequently, however, that scaling in time is rather common in non-equilibrium, and even in equilibrium dynamics, which is otherwise "a rotten place to hunt for generic scale invariance" (Grinstein, 1995, p. 262). In other words, space and time fractality need not in fact be related. Prior knowledge that one is dealing with a spatially extended, and importantly, connected, system may make such a connection more likely. The precise way to check, at least in principle, is to measure a spatiotemporal correlation function and check if one has scaling (i.e. algebraic rather than exponential dependence) in both r and t. It is interesting with hindsight that as early as 1967 Mandelbrot had realised that scaling in time, and thus "1/f" noise, need not always be attributed to the kind of stationary long range dependence seen in his own fractional Gaussian noise models. An alternative model, which was only (in his words) "conditionally" stationary, was the fractional renewal process he discussed in Mandelbrot (1967). It seems likely that his awareness that several fundamentally different yet plausible mechanisms for "1/f" noise already existed would have contributed to his evident lack of enthusiasm for SOC. [6]
Secondly, the concept of criticality (Section 6) was invoked to explain the scaling (Section 10.2) that was seen in nature, drawing heavily on the established theory of continuous (i.e. second order) phase transitions, but contrasting it with the new feature of self-organization (see the quote of Bak and Chen, 1989, p. 5 below, "[T]here is one area of physics . . . the critical state is self-organized."). Self-organization was an essential feature of the argument, in order to explain why critical behavior is apparently so common in nature. The traditional notion of criticality placed it firmly at a singular point in parameter space, which had to be accessed by ultra-fine tuning, such as careful adjustment of the temperature in a zero-gravity environment (Lipa et al., 1996). [7] In contrast, self-organized critical systems would be dynamically attracted to a state where they display scaling, i.e. long-range correlations in time and space dominate and so bring about a new, effective interaction and global features very different from the microscopic physics: More is indeed different.
As their conclusion, as mentioned above, they proposed "the SOC postulate":
The explanation is that open, extended, dissipative dynamical systems may go automatically to the critical state as long as they are driven slowly: the critical state is self-organized. We see fractals as snapshots of systems operating at the self-organized critical state. (Bak and Chen, 1989, p. 5)
The first sentence refers to features of SOC systems, which have subsequently been summarized by Jensen as "slowly driven interaction dominated threshold [(SDIDT)] systems" (Jensen, 1998, p. 126). Bak and Chen stressed openness as a required system property because at a stationary state the flux of otherwise conserved particles towards dissipative boundaries was perceived early on as a mechanism by which fluctuations and correlations are communicated throughout the system:
Note that Eq. (3.2) conserves $\sum_n z_n$ except at the boundary, so that any "excess z" must be transported to the boundary for global relaxation to occur. (Bak et al., 1988a, p. 368)
As a result, "The boundary cannot be scaled out in the limit of large system sizes as is usually done in statistical physics" (Paczuski and Bassler, 2000).
[6] In an interview in 1998 with Bernard Sapoval, archived by the Web of Stories project (http://webofstories.com), Mandelbrot remarked that: "... criticality goes beyond what I had in mind. In fact, I think perhaps it goes beyond what is necessary. I have not made up my mind on self-organized criticality, because the characteristic of the question of magnets is that there is a parameter like temperature. At a certain critical temperature very special things happen. The characteristic of phenomena like prices or like turbulence, there's no parameter. Therefore to embed a prime [sic] without a parameter in one which has a parameter, and then argue that this parameter somehow arranges to take its own value is presupposing something that is beyond reality. I mean there are no non-critical situations. So I have not made up my mind about the power of this metaphor. The idea that dependence can be global, that variance can be infinite, and in fact that everything that has been taken as finite without any question in physics or in statistics can, in fact be divergent or zero [...] is something that did not depend upon any broader conjecture about the causes of these phenomena. It comes out of efforts to describe them and has been made unavoidable by those efforts." [Our italics]
[7] In the language of dynamical systems and the renormalisation group, critical phenomena as observed at phase transitions (Domb et al., 1972-2001) are characterized by a fixed point that is repulsive in several directions and therefore accessible only from a very narrow basin of attraction.
As well as open boundaries which do not disappear in the large system limit, Bak and Chen also emphasized dissipation, which should be understood from a dynamical systems perspective. In the presence of dissipation, dynamical systems explore a greater amount of phase space than they would if subject to the constraint of energy conservation. When the statement about "dissipative dynamical systems" above was written, in Bak and Chen (1989), despite the comment on the flux of a conserved quantity, conservation in the sandpile dynamics had not yet received much attention and the focus still lay with the apparent lack of conservation when sand grains slip down a hill, thereby reducing potential energy. Hwa and Kardar (1989) and Grinstein et al. (1990) put particle conservation on the map, the latter demonstrating that conserved dynamics in conjunction with non-conserved noise (or with conserved noise and spatial anisotropy) will generically produce scale invariance.
An interesting distinction between BTW's SOC model and classic forward cascade descriptions of fluid turbulence such as Kolmogorov's 1941 model is thus that dissipation in the former takes place at the boundaries and thus on large scales, while in the latter case it takes place at the smallest scales.
In Section 7 we will discuss the sufficient and necessary conditions for SOC. It remains now to comment on the second sentence in the quote above from Bak and Chen, "We see fractals as snapshots of systems operating at the self-organized critical state." One could quite legitimately read this as "we see all fractals as . . . " rather than "we see such fractals as . . . ". The former reading is fully in line with the concise abstract of the paper, "Fractals in nature originate from self-organized critical dynamical processes" (Bak and Chen, 1989), but nonetheless it is unlikely that BTW really believed that all fractals needed SOC to explain them.
While either version of the claim is bold, it is certainly correct that fractal-like structures in time and space are exactly what characterizes critical systems, so that a claim that (some) naturally occurring fractals in "open, extended, dissipative dynamical systems" are self-organized, is in fact identical to a claim that open, extended, dissipative dynamical systems can develop into a critical state.

Criticality and minimal stability
Considerable confusion has arisen over the years from the several meanings of the word "critical" in the phrase "Self-Organized Criticality". The word "critical" has a very clear technical meaning in statistical mechanics and the theory of phase transitions. That this was indeed the intended meaning in Bak et al.'s newly coined term "Self-Organized Criticality" is clear from rereading their 1987 and 1988a papers. Unfortunately, in these same papers the word "critical" was occasionally also used in a more colloquial sense of a threshold.
Fig. 2
The one-dimensional BTW Model on a lattice of size $L = 5$ (for illustrative purposes this system is ridiculously small; much bigger systems are normally studied in SOC). Particles which are about to move are shown hatched, particles which are about to appear somewhere are shown in gray. The current configuration is $h = \{5, 5, 4, 2, 2\}$ and $h = \{6, 5, 3, 3, 1\}$ after the update indicated. One gray particle is going to be added by the external driving on site $i = 1$, which takes place only if no toppling occurs somewhere in the system, such as the ones on site 5 or from site 3 to 4. The slope exceeds the threshold at two sites, $i = 3$ and $i = 5$. When the latter topples, one particle will be lost by dissipation at the boundary, as if $h_6 = 0$ permanently.
In this section we clarify and distinguish three distinct meanings of the word "critical", and illustrate them by use of BTW's famous sandpile model:
• critical spatiotemporal correlations, such as those seen at phase transitions,
• critical thresholds, and
• the value of a global (control) parameter at the critical point.

The BTW Model
In order to fully appreciate the distinction between "critical" in the technical and in the loose sense, it is instructive to introduce the famous BTW sandpile model, illustrated in Figure 2. On a one-dimensional grid of sites i = 1, 2, . . . , L it is defined as follows: Each site i carries a number $h_i$ of grains. If the slope $h_i - h_{i+1}$ at site i exceeds a threshold, $h_i - h_{i+1} > 1$, then one grain is moved from i to i + 1, so that $h_i \to h_i - 1$ and $h_{i+1} \to h_{i+1} + 1$, thereby decreasing the slope at i and increasing the slope at the up- and downstream sites, which may trigger further updates. The totality of these updates constitutes an avalanche, which is triggered by adding a particle at a randomly chosen site i (known as the external drive), so that $h_i \to h_i + 1$, and carries on until none of the sites exceeds the threshold any more. Only then does the driving resume. In Figure 2 the driving takes place at site i = 1 and a further toppling takes place from site 3 with $h_3 = 4$ to site 4 with $h_4 = 2$ prior to the update. The (virtual) edge site L + 1 carries a stack of height $h_{L+1} = 0$ by definition and is never updated, i.e. particles are dissipated here, as indicated in Figure 2.
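The update rule just described can be sketched in a few lines of Python. This is a minimal illustration, not BTW's own code: the function names `topple_once` and `relax` are ours, and we assume that all unstable sites topple in parallel, as the caption of Figure 2 suggests.

```python
def topple_once(h, threshold=1):
    """Apply one parallel update of the 1D rule; return (new heights, number of topplings)."""
    L = len(h)
    padded = list(h) + [0]  # virtual edge site L+1 with height 0, never updated
    # Sites whose slope h[i] - h[i+1] exceeds the threshold before the update
    unstable = [i for i in range(L) if padded[i] - padded[i + 1] > threshold]
    new_h = list(h)
    for i in unstable:
        new_h[i] -= 1            # one grain leaves site i ...
        if i + 1 < L:
            new_h[i + 1] += 1    # ... and lands on i+1, unless it drops off the edge
    return new_h, len(unstable)


def relax(h, threshold=1):
    """Topple until no slope exceeds the threshold; return (stable heights, avalanche size)."""
    size = 0
    while True:
        h, n = topple_once(h, threshold)
        if n == 0:
            return h, size
        size += n


# Configuration of Figure 2 (sites 1..5 are stored at indices 0..4)
print(topple_once([5, 5, 4, 2, 2]))   # ([5, 5, 3, 3, 1], 2)
print(relax([5, 5, 4, 2, 2]))         # ([5, 4, 3, 2, 1], 8)
```

Together with the driving grain drawn on site 1 in the figure, the first result gives h = {6, 5, 3, 3, 1}, the configuration quoted in the caption. Relaxing this example completely ends in a staircase in which every slope equals the threshold value.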
The sandpile model in one dimension has some very distinctive features, and its (very simple) behavior differs significantly from that seen in dimensions greater than unity. In higher dimensions, a local, scalar slope (rather than a gradient) is introduced. In two dimensions sites are labeled (i, j) with i, j ∈ {1, . . ., L} and carry a local slope $z_{(i,j)}$. If that exceeds a threshold, say 3, then $z_{(i,j)}$ is reduced by 4, i.e. $z_{(i,j)} \to z_{(i,j)} - 4$, and the slope at each of the four nearest neighbors is increased by one unit, i.e. $z_{(i\pm1,j)} \to z_{(i\pm1,j)} + 1$ and $z_{(i,j\pm1)} \to z_{(i,j\pm1)} + 1$. Boundary conditions are such that (virtual) sites outside the lattice are not updated, i.e. slope units are lost there.
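A minimal sketch of the two-dimensional rule follows. The name `relax_2d` and the sequential (stack-based) update order are our assumptions; for this model the final configuration is known not to depend on the order of topplings (the Abelian property), so the choice is one of convenience.

```python
def relax_2d(z, threshold=3):
    """Relax an L x L grid of slopes in place; return the number of topplings."""
    L = len(z)
    topplings = 0
    unstable = [(i, j) for i in range(L) for j in range(L) if z[i][j] > threshold]
    while unstable:
        i, j = unstable.pop()
        if z[i][j] <= threshold:
            continue                      # this entry was relaxed by an earlier toppling
        z[i][j] -= 4
        topplings += 1
        if z[i][j] > threshold:
            unstable.append((i, j))       # still unstable, topple again later
        for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= ni < L and 0 <= nj < L:   # units sent outside the lattice are lost
                z[ni][nj] += 1
                if z[ni][nj] > threshold:
                    unstable.append((ni, nj))
    return topplings


z = [[0, 0, 0],
     [0, 4, 0],
     [0, 0, 0]]
print(relax_2d(z), z)   # 1 [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```

In the example a single unstable site in the bulk passes one unit to each of its four nearest neighbors. The same toppling at a corner site would lose two of its four units across the boundary.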
The density of these slope units, $\zeta = \sum_{i,j} z_{(i,j)} / L^2$, may be thought of as a control parameter which is generated by the pile rather than externally imposed: If it is big, then avalanches can be expected to be large, since some sites surely exceed the threshold if ζ does. If ζ is small, avalanches will be small as well. The external drive increases ζ, at least temporarily, whereas dissipation at boundaries decreases it. Because large avalanches promote dissipation, they reduce the control parameter, whereas small ones may leave it unchanged. In other words, large avalanches can be expected to occur at comparatively large ζ, typically reducing ζ, whereas small avalanches occur at low ζ. This feedback has been suggested to be at the heart of SOC (Dickman et al., 1998). Because ζ is a global average, its fluctuations will decrease with increasing system size L, eventually "pinching" it at its mean.
exhibited algebraic behaviour at criticality. We would then have had the analogy: ζ = magnetisation, z = local spin, and the external drive = temperature. Instead, in our discussion above, and for example also in (Peters and Pruessner, 2009), ζ is seen as the control parameter that drives an activity (topplings). The density of the activity of topplings is then the order parameter of an absorbing state phase transition.
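The feedback between drive, dissipation and the slope density ζ can be watched directly in a small simulation. The sketch below is an illustration under our own assumptions (the function name `drive_and_relax`, the lattice size, the number of grains and the random seed are arbitrary choices): it adds one unit at a random site, lets the avalanche run to completion, and records ζ after every avalanche.

```python
import random


def drive_and_relax(L=16, grains=4000, threshold=3, seed=0):
    """Slowly drive an L x L pile; return the slope density zeta after each avalanche."""
    rng = random.Random(seed)
    z = [[0] * L for _ in range(L)]
    total = 0                      # running sum of all slope units on the lattice
    history = []
    for _ in range(grains):
        i, j = rng.randrange(L), rng.randrange(L)
        z[i][j] += 1               # external drive: zeta goes up by 1/L^2
        total += 1
        stack = [(i, j)]
        while stack:               # relaxation completes before the next unit is added
            a, b = stack.pop()
            if z[a][b] <= threshold:
                continue
            z[a][b] -= 4
            total -= 4
            stack.append((a, b))   # re-check this site in case it is still unstable
            for na, nb in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                if 0 <= na < L and 0 <= nb < L:
                    z[na][nb] += 1
                    total += 1     # units leaving the lattice are not added back
                    stack.append((na, nb))
        history.append(total / L**2)
    return history


zeta = drive_and_relax()
print("mean zeta over the last 1000 avalanches:", sum(zeta[-1000:]) / 1000)
```

In runs of this kind ζ first grows while the empty lattice fills up and then fluctuates about a stationary value below the threshold. The precise number depends on L and on the seed, which is why no expected output is given here.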

The meaning of "Criticality"
The behavior of the sandpile model is reminiscent of the behavior of a system undergoing a continuous phase transition (e.g. Stanley, 1971; Yeomans, 1994; Christensen and Moloney, 2005). Phase transitions have been one of the centers of attention in statistical mechanics for well over one hundred years. They normally occur in a system as some control parameter, such as the temperature, is changed.
Critical spatiotemporal correlations: One particular class of phase transitions, the so-called continuous or second-order phase transitions, has the peculiar feature that at the critical point, that is, for some special value of the control parameter, correlations become long ranged (follow a power law) and, equivalently, fluctuations occur on all length scales, i.e. there is no characteristic size and the size distribution of the fluctuations displays a power law dependence with a non-trivial exponent. Moreover, an observable indicates the onset of long-range order, whose presence distinguishes two different phases. Traditionally, that observable goes by the name of the "order parameter". It is a suitably but not uniquely defined quantity that vanishes in one phase (the disordered or high-temperature phase) and is finite in the other (the ordered or low-temperature phase). The susceptibility, which measures fluctuations and equivalently (by the linear response theorem) the response of the order parameter to a small external perturbation, diverges with the system size. This is known as critical behavior, a critical phenomenon or just criticality.
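In formulas, the statements of this paragraph take the following schematic form. The symbols η, ν, γ, ξ and the scaling function g are the conventional textbook notation and are our assumption here; they are not defined in this section.

```latex
\begin{align*}
% Two-point correlation function of the order parameter in d dimensions,
% with correlation length \xi and a scaling function g that decays
% rapidly for large argument
G(r) &\sim r^{-(d-2+\eta)}\, g(r/\xi), & \xi &\sim |T - T_c|^{-\nu}, \\
% At the critical point \xi diverges and only the power law remains
G(r)\big|_{T = T_c} &\sim r^{-(d-2+\eta)}, \\
% Susceptibility of a finite system of linear size L at the critical point
\chi(T_c, L) &\sim L^{\gamma/\nu}.
\end{align*}
```

Away from the critical point the finite correlation length ξ sets a characteristic scale. At the critical point no such scale survives, and the only remaining length is the system size L.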
The term self-organized criticality refers to exactly that last, technical usage of criticality. Bak and Chen explicitly referred to the long-range spatial correlations:
"[T]here is one area of physics where the relation between spatial and temporal power law behavior is well established. At the critical point for continuous phase transitions, the correlation function for the order parameter decays spatially as $r^{2-d-\eta}$ and temporally as $t^{-d/z}$. But in order to arrive at the critical point, one has to fine-tune an external control parameter such as the temperature or the pressure, in contrast to the phenomena above which occur universally without any fine-tuning. The explanation is that open, extended, dissipative dynamical systems may go automatically to the critical state as long as they are driven slowly: the critical state is self-organized." (Bak and Chen, 1989, p. 5)
Bak, Tang and Wiesenfeld also explain what makes criticality so relevant and so attractive:
"At the critical point there is a distribution of clusters of all sizes; local perturbations will therefore propagate over all length scales, leading to fluctuation lifetimes over all time scales. A perturbation can lead to anything from a shift of a single pendulum to an avalanche, depending on where the perturbation is applied. The lack of a characteristic length leads directly to a lack of a characteristic time for the resulting fluctuations." (Bak et al., 1987, p. 382)
Criticality in SOC, however, is not reached by setting a control parameter to a "critical" value:
"The criticality in our theory is fundamentally different from the critical point at phase transitions in equilibrium statistical mechanics which can be reached only by tuning of a parameter, for instance the temperature. The critical point in the dynamical systems studied here is an attractor reached by starting far from equilibrium: The scaling properties of the attractor are insensitive to the parameters of the model. This robustness is essential in our explaining that no fine tuning is necessary to generate 1/f noise (and fractal structures) in nature." (Bak et al., 1987, p. 381)
Critical thresholds: Unfortunately a second, looser meaning of "critical" has led to confusion. It refers to the threshold that is frequently thought to govern the microscopic dynamics of SOC systems, and can be found in the same publications as those quoted above.
"This will cause the force on a nearest-neighbor pendulum to exceed the critical value and the perturbation will propagate by a domino effect until it hits the end of the array." (Bak et al., 1987, p. 382)
Here "critical value" refers to the threshold beyond which activity sets in. We would discourage this usage of "critical" and instead urge the use of the alternative, "threshold value". We also suggest refraining from combining it with "critical", as in "critical threshold".
Critical global control parameters: In a subtle variation of the first meaning of "critical" mentioned above, a third, technical meaning refers to a global control parameter taking a critical value:
"If the slope is too large, the pile is far from equilibrium, and the pile will collapse until the average slope reaches a critical value where the system is barely stable with respect to small perturbations." (Bak et al., 1987, p. 382)
Because it was hitherto unclear whether there were generally order and control parameters in SOC and, if so, what they were, one could possibly interpret this quote also as referring to a global order parameter which reaches the value characteristic of criticality.
In traditional critical phenomena, the critical value of the global control parameter would be the critical temperature, or a critical probability, etc., but here it is the average slope, which seems to link also to the second meaning, because it is the average over some local dynamical feature. In contrast to the first meaning, "critical" in the quote above refers to the value some parameter attains so that the system maintains criticality, e.g. divergent susceptibility. "Self-Organized Criticality" would then refer to the self-organization of the presumed control parameter to its critical value, rather than the self-organization of the system to display criticality. The difference between the two is subtle, but very important. The former interpretation emphasizes the existence of a critical point and the self-organization of a control parameter to that value, whereas the latter focuses on the appearance of the system as critical. Although we believe that the name SOC refers to the latter (the system displaying criticality), in Section 7 we briefly discuss BTW's arguments that a critical point exists and that the system's dynamics drives that control parameter towards that value.

Minimal stability
Further confusion has arisen from the usage of the term "minimally stable", alluding to chaotic behavior which was being explored in the literature under the headline "edge of chaos" (e.g. Langton, 1990; Kauffman and Johnsen, 1991; Ray and Jan, 1994; Melby et al., 2000).
"Our picture of 1/f spectra is that it reflects the dynamics of a self-organized critical state of minimally stable clusters of all length scales, which in turn generates fluctuations on all time scales." (Bak et al., 1987, p. 384)
In fact, SOC was introduced using the terminology of "minimally stable states", which lose stability by even the tiniest perturbation anywhere. The language and the basic concept draw on the theory of dynamical systems. This seems to be the obvious interpretation of "minimally stable", namely a state where the smallest perturbation leads to a system-wide relaxation. That is in fact the case in the one-dimensional sandpile model, which develops into a state where (almost) all sites have a slope corresponding to the threshold value. However, the one-dimensional sandpile is exceptional in that respect:
"[W]e consider for pedagogical reasons an example in one spatial dimension. In this case the spatial degrees of freedom "decouple" and the system ends up in the least stable metastable state. This minimally stable state is a trivial critical state with no spatial patterns and uninteresting temporal behavior." (Bak et al., 1988a, p. 365)
In higher dimensions, a small perturbation may lead to a response by the system at any scale. In contrast to "least stable metastable state", the term "minimally stable" in the context of SOC refers to the possibility of system-wide events. Bak et al. (1987) illustrated that very clearly in their Fig. 1, where every site is shaded that takes part in a particular avalanche. Some sites among these clusters surely have a slope below the threshold, so even though labeled "minimally stable" the system shown is not in a state where any charge anywhere would result in a system-wide avalanche or, in fact, necessarily any avalanche at all.
In fact, later studies make it abundantly clear that the average value of the dynamical variable (the local degree of freedom subject to interaction, see below) in SOC systems is normally well away from the threshold. For example, in two dimensions, the sandpile model has been conjectured (Grassberger quoted by Dhar, 2006, finally confirmed analytically by Caracciolo and Sportiello, 2012) to have an average height of 17/8 = 2.125, well below the threshold of 3, and the Abelian Manna Model (Manna, 1991; Dhar, 1999b) with threshold 1 in one dimension has average height 0.9488(5) (Dickman et al., 2001), expected to drop to 1/2 with increasing dimension (Huynh et al., 2011).
In summary, the system's "critical" features which Bak et al. claimed to have been self-organized are long-ranged correlations and divergent susceptibility to external perturbations. SOC systems organize themselves to a state where they look very much like those at a critical point, as if they were undergoing a phase transition. What BTW did not mean, and did not imply, is that the system organizes itself into a state where every local degree of freedom is close to some threshold. This happens to be the case for the one-dimensional sandpile model, but this should be regarded as a coincidence, not least because the one-dimensional sandpile model shows no interesting features otherwise. Bak et al. left it open whether the apparent control parameter reaches some critical value. In fact, a successful theory of SOC suggests (Dickman et al., 2001) that the control parameter fluctuates about its critical value.

The necessary and sufficient conditions for SOC
The separation of cause and effect has long been problematic in much of the debate surrounding SOC, so we now set out arguments for
• three necessary features that a system needs to exhibit in order to qualify as SOC, and
• three sufficient ingredients comprising a mechanism (SDIDT) that produces SOC,
and we attempt to decide if SOC and SDIDT are synonymous.

SOC's "phenotype": The necessary conditions to observe it
As explained above, in Section 6, BTW regarded SOC as a critical phenomenon in the traditional sense of statistical mechanics, i.e. a system displaying non-trivial scaling (scaling that deviates from what is generated by simple dimensional analysis, see Section 10.2). While this can be seen in many different observables, and in fact, in SOC is often observed in integrated, global quantities, such as avalanche sizes and durations, scaling should manifest itself in particular through the presence of long-ranged spatio-temporal correlations. The term "long-ranged" alludes to the fact that, again, these correlations should display power law scaling and not decay like, say, an exponential. Demanding direct evidence for the scaling of spatiotemporal correlations is a technical challenge, as explained in Section 10.2.
One key aspect of SOC, however, deviates strongly from traditional (tuned) critical phenomena (see the quote above in Section 6.2, "[T]here is one area . . . continuous phase transitions . . . without any fine-tuning . . . critical state is self-organized.", from Bak and Chen, 1989, p. 5), in that the latter always require tuning to a critical point, i.e. a precise setting of one or more parameters to a specific finite value. The dynamics of SOC systems is supposed to drive them to the critical point without the need of such external "tweaking" of a control parameter. Prior to the advent of SOC, some systems were known, in particular growth phenomena (e.g. the KPZ equation, Kardar et al., 1986), that did display non-trivial scaling without external tuning of a control parameter (and certainly in the presence of competing length scales). BTW, however, argued that although a critical point exists in SOC systems, the dynamics itself drives the system towards a critical point, which otherwise would only be reached by external tuning of a control parameter:
"The critical point in the dynamical systems studied here is an attractor reached by starting far from equilibrium: The scaling properties of the attractor are insensitive to the parameters of the model. This robustness is essential in our explaining that no fine tuning is necessary to generate 1/f noise (and fractal structures) in nature. [. . . ] In a sense, the dynamically selected configuration is similar to the critical point at a percolation transition where the structure stops carrying current over infinite distances, or at a second-order phase transition where the magnetization clusters stop communicating." (Bak et al., 1987, p. 381)
A suitable mechanism of self-organization to a critical point, in some ways reminiscent of earlier suggestions by Tang and Bak (1988a,b), was put forward and made explicit by Vespignani et al. (1998; see also Dickman et al., 1998). Its verification remains subject to ongoing research.
Having made the case for "truly self-organized, truly critical" SOC systems, it remains to remark that even so, every finite system still has an inherent scale, namely the system size itself. This is obviously also the case in traditional critical phenomena, but there the control parameter can be tuned away from the critical point. In such tuned systems the phenomena observed and the measurements taken can therefore approximate the infinite system (or "thermodynamic limit"), at least in principle, arbitrarily well, by increasing the system size ever more, as the control parameter is tuned closer and closer to the critical value. This is not the case in SOC systems, which are (supposed to be) located right at the critical point with the result that all observables that display any form of scaling, or which are expected to be divergent in the thermodynamic limit, will depend on the system size. This dependence is called finite size scaling, a well known and understood aspect of traditional critical phenomena (Barber, 1983, see also the discussion in Appendix A).
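For avalanche sizes s, this dependence on the system size is usually written as a finite size scaling ansatz. The form below is the one commonly used in the SOC literature; the symbols τ, D, the metric factors a and b, the lower cutoff $s_0$ and the scaling function $\mathcal{G}$ are our notation and are not introduced in this section.

```latex
\begin{align*}
% Avalanche size distribution in a system of linear size L:
% a power law with exponent \tau, cut off at a scale growing as L^D
P(s; L) &= a\, s^{-\tau}\, \mathcal{G}\!\left(\frac{s}{b\, L^{D}}\right)
  \qquad \text{for } s \gg s_0, \\
% Resulting scaling of the moments, valid for n > \tau - 1
\langle s^{n} \rangle &\propto L^{D(1 + n - \tau)} .
\end{align*}
```

The second line follows from the first by integrating $s^n P(s; L)$ up to the cutoff. It shows how every sufficiently high moment diverges with L, with no control parameter left to tune.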
The key features of a system in the SOC state, their phenotype, can thus be summarized as
1. Non-trivial scaling (finite size scaling; no dependence on a control parameter).
2. Spatio-temporal power law correlations.
3. Apparent self-tuning to the critical point (of a possibly identified, underlying continuous phase transition).
where the first and the second item may be seen as aspects of the same feature: criticality.
Apart from the many proposed SOC systems, one candidate system that seems to fulfill (some of) the above features is invasion percolation (Wilkinson and Willemsen, 1983), which predated SOC and was likened to it early on (Grassberger and Manna, 1990). It clearly displays non-trivial scaling (namely that of percolation), it clearly displays spatial correlations, and (with suitable definition of the dynamics) also correlations in time. In fact, much like traditional SOC models, its burst-like evolution may be regarded as one possible form of intermittency, an aspect of many complex systems, to be further discussed below. However, such avalanches do not appear in cycles of charge and relaxation, but are part of an ever increasing region invaded by a cluster (Sornette et al., 1995). In other words, invasion percolation does not develop into a stationary state (in the statistical sense). What is more, even when invasion percolation displays the scaling of ordinary percolation, there is no suggestion of any self-tuning taking place. Rather, invasion percolation sits right at the critical point by definition. In that sense, invasion percolation resembles Brownian motion, which offers a rich variety of power law correlated features (although not strictly non-trivial), without any apparent self-tuning (Sornette, 2006; Milovanov, 2013).

SOC's "genotype": The sufficient conditions to generate it
Above we have relied heavily on the original, early work by BTW to define SOC phenomenologically. The features listed above are in fact all the necessary conditions for SOC, with "necessary" taken strictly in the logical sense (they define SOC: if and only if all of them are observed, one is faced with SOC). The sufficient conditions for SOC then point to a cause of SOC, asking for the system's key ingredients in order to produce those SOC characteristics, and are in a sense its genotype. A lot of the research into SOC centers precisely around these sufficient conditions; the early hunt for different members of the BTW universality class (e.g. Zhang, 1989; Manna, 1991) was clearly motivated by this question.
The most obvious key ingredient of any SOC model is the presence of nonlinearities in the interaction, so that the response is not a simple, linear function of the size of the external perturbation. In most SOC models the non-linearity is realized as a threshold in the interaction, i.e. activity can spread only when some local dynamical variable exceeds a threshold. We note that the algorithms for BTW's sandpile model and the Edwards-Wilkinson (EW; Edwards and Wilkinson, 1982) model of deposition are the same except that BTW SOC has thresholded diffusion whereas EW has simple diffusion.
When the threshold is overcome, interactions between neighboring dynamical variables take place (often referred to as "topplings") and as a result bursts of activity occur in the form of avalanches, which can involve the system in its entirety. These avalanches spread because the interaction in a toppling can induce a neighboring local dynamical variable to overcome a threshold, even when prior to the interaction it was not very close to being "triggered". In the presence of thresholds, avalanching is thus naturally expected. Of the sufficient conditions listed below, avalanching is therefore the most likely candidate to be superfluous, because it is implied by the other conditions, and in fact may be listed with the necessary ones, in the sense that it is part of the definition of SOC beyond the immediate meaning of just these three letters.
Surveying the wealth of systems (supposedly) displaying SOC, a very strong candidate for our final key ingredient is the separation of the time scales of driving and relaxation, which is implied already in the original definition of the BTW model. SOC systems are slowly driven, so that the characteristic time scale of the driving does not interfere with the internal, fast time scale of the relaxation. In computer models and generally when the relaxation occurs in bursts or avalanches, the separation of time scales can be completed. If one insists on intermittent relaxation in the form of avalanches, then it is obvious that the driving must be sufficiently slow as not to disturb the avalanche while it is running (Chapman et al., 2009), otherwise continuous activity will result (Corral and Paczuski, 1999) and individual avalanches are no longer discernible without the use of, say, some arbitrary threshold.
In the earlier days of SOC in particular, some debate (Bröker and Grassberger, 1999) revolved around the question of whether the demand of a separation of time scales amounts to a form of tuning or global supervision (by a "babysitter", Dickman et al., 2000, or a "farmer", Bröker and Grassberger, 1999), thereby rendering SOC a tuned type of criticality after all.
We thus summarize the sufficient conditions so far (the "genotype", to draw that parallel) as
4. Non-linear interaction (required by 1), normally in the form of thresholds.
5. Avalanching (intermittency, expected in the presence of thresholds and slow driving).
6. Separation of time scales (obvious requirement to sustain distinct avalanches).
We hypothesise that every system that simultaneously fulfils these three conditions will display SOC according to its definition in 1-3, and vice versa.
Together with the "phenotypical" conditions 1-3 listed above one might summarize all six of them by defining SOC as "Slowly driven, avalanching (intermittent) systems with non-linear interactions, that display non-trivial power law correlations (cut off by the system size) as known from ordinary critical phenomena, but with internal, self-organized, rather than external tuning of a control parameter (to a nontrivial value)."

Must SOC and SDIDT always be the same?
We believe that the most promising instances of SOC, in particular computer and theoretical models, fulfill these criteria. If they are really sufficient (but see below) and implied by the necessary conditions, i.e. they are complete and not too narrow, then fulfilling conditions 1-3 implies fulfilling conditions 4-6 and vice versa.
Fig. 3 (a) SDIDT is a subset of IDT and of SOC. (b) Maybe all of SOC belongs to SDIDT. (c) Maybe SOC is a small subset of SDIDT. Slowly driven, interaction dominated threshold (SDIDT) systems are a subset of interaction dominated threshold (IDT) systems. Interpreting slow drive as implying intermittency, SDIDT coincides (roughly) with the list of sufficient conditions, 4-6, for SOC. All SDIDT may therefore be expected to display SOC, as indicated by both Figure 3(a) and Figure 3(b). In the latter Venn diagram, however, all SOC belongs to SDIDT, i.e. SDIDT and therefore conditions 4-6 are not only sufficient, they are also necessary. It may well be, however, that conditions 4-6 are not complete, i.e. they are only a subset of the sufficient conditions. In that case, Figure 3(c) applies, where SOC is a subset of the larger class of SDIDT.
Conditions 4-6 may also be interpreted as a paraphrase of "slowly driven, interaction dominated threshold (SDIDT) systems" (see Section 5), with intermittency implied by the slow drive. That SDIDT is sufficient for the occurrence of SOC was conjectured earlier (Jensen, 1998, p. 126) and was well received in the plasma science community (Section 9), although sometimes with much reduced emphasis on slow drive. Such systems are shown as IDT in the Venn diagrams of Figure 3. There are several models considered for their supposed SOC behavior, which either lack any obvious driving or whose driving is subject to conditions beyond just being slow (Wilkinson and Willemsen, 1983; Moßner et al., 1992; Jensen, 2002b, 2004; Bonachela and Muñoz, 2009). To this day, SOC models are studied with finite drive (Corral and Paczuski, 1999), i.e. as IDT models in their own right. The Venn diagrams of Figure 3 exclude the possibility of all IDT being SOC, as the only overlap of SOC and IDT amounts to SDIDT, indicating that slow drive may not only be a sufficient condition for SOC, but also a necessary one, which sits well with the notion that finite driving introduces a finite scale and therefore possibly a cutoff.
While the relation between SDIDT and IDT is an obvious one, the relation between SDIDT and SOC is less clear. According to the list above, points 4-6, SDIDT systems should display SOC (as indicated in Figures 3(a) and 3(b)), but the statement that SOC phenomena are restricted to SDIDT is a much stronger one, as illustrated in Figure 3(b). If we identify criteria 4-6 with SDIDT and they are not too narrow (in that sense truly minimal), then there is no SOC outside SDIDT, i.e. Figure 3(b) and not Figure 3(a) is the correct representation of the status quo of SOC.
Given, however, that some supposed SOC models, like the Forest Fire Model, lack scaling (Pruessner and Jensen, 2002a), a more realistic concern is that conditions 4-6 are incomplete, so only some particular SDIDT systems display SOC, as shown in Figure 3(c). It remains one of the most important questions in SOC to complete the list of sufficient conditions without making them too narrow.
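Written as set relations, the three scenarios of the Venn diagrams discussed in this subsection read as follows. This is only our shorthand for panels (a), (b) and (c), treating SOC, SDIDT and IDT as sets of systems.

```latex
\begin{align*}
% (a) conditions 4-6 are sufficient but not necessary:
%     there is SOC outside SDIDT (and outside IDT)
\text{(a)}\quad & \mathrm{SDIDT} = \mathrm{IDT} \cap \mathrm{SOC}, \qquad
                  \mathrm{SDIDT} \subsetneq \mathrm{SOC}, \\
% (b) conditions 4-6 are sufficient and necessary
\text{(b)}\quad & \mathrm{SOC} = \mathrm{SDIDT} \subsetneq \mathrm{IDT}, \\
% (c) conditions 4-6 are incomplete: only some SDIDT systems display SOC
\text{(c)}\quad & \mathrm{SOC} \subsetneq \mathrm{SDIDT} \subsetneq \mathrm{IDT}.
\end{align*}
```

In all three cases SDIDT is a proper subset of IDT, and no IDT system with finite drive is counted as SOC.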

Why then is SOC controversial?
In response to the points we have made above, a natural question may already be occurring to the reader: "If, as you claim, SOC was originally relatively clearly defined, and if one can now define necessary and sufficient conditions for it, why was it (and is still) controversial?" This is a good question, and there are many reasons, of which we identify the following as particularly important:
• Uncertainty and miscommunication about what the essential SOC claim in fact was; confusion between the phenomena to be explained and the mechanism proposed as their explanation; and, as a consequence, confusion about what to look for as experimental "proof", and what to look for as potential application of the theory.
• SOC models and supposed occurrence of SOC in nature are easy to test for badly and difficult to test for well. To this day, outside the field of tuned phase transitions, really solid empirical evidence for scale invariance in nature by lab experiment or analysis of observational data is actually quite limited even when its "ubiquity", as recognized by Mandelbrot and Bak, is the very motivation for the field. Debates thus continue over the ubiquity of fractals, particularly spatiotemporal fractals and avalanches (e.g. Avnir et al., 1998).
• Many of the SOC models are highly idealized and do not even attempt to capture the basic interactions of a natural system. Rather than the caricature of magnetism in the Ising model, or the way in which a shell model encapsulates the symmetries of turbulence, they often consist of a set of rules in the vein of a cellular automaton, designed to display spatio-temporal scale invariance. However, at closer inspection many fail to display the desired features.
• "Human Factors": Citation and priority of Mandelbrot and others ruffled some feathers, while SOC may have been a distraction from other important and prior work on spatio-temporal fractality.
• Confusion about the deterministic nature and predictability in some SOC models, and the natural phenomena they were supposed to apply to. On the one hand, the prime example of SOC was the sandpile model, which evolves according to deterministic rules; on the other hand, Bak and Tang (1989, p. 15636) concluded (about earthquakes) that "there is virtually no hope for ever making specific predictions".
• There are alternative explanations even within other theoretical work on critical phenomena for "dirty power laws" and "fat tails", such as "plain old criticality" (Perković et al., 1995) or "sweeping of an instability" (Sornette, 1994).
In the following, we address these points in further detail.

Confusion
As discussed in Section 3 and Figure 1, the relatively clear core claim of SOC (of the possibility of self-tuned phase transitions in nature) was sometimes coupled with a perception that SOC aims to explain all fractals or even all power laws. That claim was not made initially, though some proponents of SOC later nourished that belief even in their popular writing. In Bak's own "How Nature Works" (Bak, 1996) power laws of possible relevance to SOC (such as solar flare X rays) and others almost certainly irrelevant to SOC (such as Zipf's law of word length) were mentioned side by side. This had the unfortunate consequence of obscuring the fact that the power law distributions in avalanche sizes and durations were only proxies for the power law correlation functions that BTW described as a crucial aspect of the unification of spatio-temporal fractals that they were seeking (see also Section 4 and Section 7). This already serious problem was compounded by a confusion of the proposed explanation (SOC and its dynamics) with the explanandum (the thing to be explained, namely [ubiquitous] spatio-temporal fractals). For as soon as the dynamics of SOC processes is accepted as the universal explanation for the phenomenon of spatio-temporal fractals, every observation of such fractals becomes an instance of "SOC at work". Worse, observation of scaling may be seen as evidence for the validity of SOC as an explanation, and (by association or a leap of faith) of the boldest of all SOC claims (Figure 1), that the contingency of nature derives from SOC.
At first, measuring event size distributions may have been a necessary evil, as correlations are so much harder to acquire and analyze (see Appendix A), at least in numerical simulations and from observational data. Over the years, in the absence of an easy way of measuring correlations, the literature as a whole moved towards the notion of power law distributions as a replacement (rather than a proxy, as discussed in Section 4) for power law correlations. To a large extent the distinction was forgotten and the significance of the latter associated with the former. Statistical mechanics provides a systematic link between the two in the form of a sum-rule, akin to the one relating susceptibility and correlation functions in, say, the Ising Model (Stanley, 1971, p. 120, and in SOC, Pruessner, 2012, Sec. 8.5.4.1). Yet, the relation between the two is strained by technicalities and it is often far from obvious which correlation function is expected to display (power law) scaling if an observable representing a spatio-temporal integral (such as the avalanche size as the activity integrated in time and space) follows a power law distribution. Power law distributions are therefore often a proxy for something unknown.
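For the Ising model the sum-rule mentioned above takes the following standard form. The notation ($s_i$ for the spins, N for their number, β for the inverse temperature, G for the connected correlation function, η for the anomalous dimension) is the usual textbook one and is our assumption; the second equality assumes translational invariance.

```latex
\begin{align*}
% Susceptibility per spin as a sum over connected two-point correlations
\chi &= \frac{\beta}{N} \sum_{i,j}
        \left( \langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle \right)
      = \beta \sum_{\mathbf{r}} G(\mathbf{r}), \\
% At criticality, with G(r) \sim r^{-(d-2+\eta)} up to the system size L
\chi &\sim \int^{L} r^{d-1}\, r^{-(d-2+\eta)}\, \mathrm{d}r \sim L^{2-\eta}.
\end{align*}
```

An integrated, global quantity (χ) thus inherits its power law dependence on L from the power law decay of the correlations. In SOC the analogous step from avalanche statistics back to a specific correlation function is much less direct.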
Nevertheless, a significant number of papers in the wider literature accepted the notion that every observation of a power law readily signals the presence of long ranged (i.e. power law) spatio-temporal correlations. In some cases, however, power law distributions are "trivial" in that they arise without non-trivial interaction and correlations (see Appendix A). For example, some directed sandpile models display power law scaling in the avalanche size distribution, but no spatial correlations whatsoever (Pruessner, 2004a). It is fair to assume that the proponents of SOC were well aware of the difference between "power law distributions" on the one hand and "power law correlations" on the other. It is probably also fair to assume that they were fully aware of the core claim of SOC being a hypothesis subject to an ongoing investigation.
Above we have tried to draw a line between power laws observed in event size distributions and power laws observed in correlations. Not every power law event size distribution is indicative of power law correlations. Traditionally, at least in statistical mechanics, the emphasis has been on the latter, as power law correlations indicate long ranged correlations, which normally (if exponents are not too large) signal cooperative phenomena. They are interesting, because the whole is then more than its parts, i.e. the system cannot then be completely described by decomposing it into smaller compartments or components.
The distinction between "spatio-temporal (power law) correlations" and "spatiotemporal fractals" is even more blurred: Clearly, fractal spatio-temporal structures imply non-trivial, long-ranged (i.e. power law) spatio-temporal correlations. The converse connection, however, is quite loose, as it is far from clear as to what to expect to be fractal in the presence of power law correlations. Is it justified to assume that fractal features of less tangible objects (such as spatio-temporal activity patterns) indicate an underlying fractal structure of the constituent parts of the system?

Ubiquity, universality, generality
The argument ". . . it is hard to believe that long-range spatial and temporal correlations can exist independently . . . [a]nd a large, coherent spatial structure cannot disappear (or be created) instantly." (Bak and Chen, 1989, p. 5, as quoted in Section 5), is a reasonable one and won many supporters. It rests on the realization that surely long-range correlations cannot be confined to a particular dimension, rather they feed through to all space and time dimensions. That perception, however, has long been revised: In equilibrium critical phenomena, certain dynamics or algorithms (Swendsen and Wang, 1987;Wolff, 1989) evolve spatial fractals essentially with little or no temporal correlations. Vice versa, temporal correlations do not necessitate spatial correlations, as illustrated by, say, directed sandpile models (Dhar and Ramaswamy, 1989;Pruessner, 2004a), which carry no spatial correlations and yet display memory. In fact, Grinstein (1995, p. 267) called temporal correlations "in the presence of a local conservation law [. . . ] difficult to avoid". In other words, if SOC is expected in the simultaneous presence of spatial and temporal correlations, then the existence of both has to be ascertained, because one does not imply the other.
As mentioned several times throughout this piece, it is notoriously difficult to measure long range spatio-temporal correlations in situ or even numerically (but see below) and therefore many authors resorted to measuring spatio-temporal integrals of observables, such as avalanche sizes, durations and areas. The scaling of the distribution of these event sizes, say the avalanche size, can be related to the scaling of a correlation function, say the activity propagator, measuring the spreading of "activity" in the system some time after a triggering event at some "seeding point" in the system (Pruessner, 2015). Alternatively, the avalanche size can be expressed in terms of the spatially averaged activity (Lübeck, 2004; Pruessner, 2012, Sec. 9.3.4 and McAteer et al., 2014). Less directly, the width of the interfacial mapping (Paczuski and Boettcher, 1996; Pruessner, 2003) of the Oslo Model, which is related to the scaling of the probability density function of the avalanche size, scales exactly like the height-height correlation function of that interface (Barabási and Stanley, 1995). It is probably fair to say that it is difficult enough to extract from experiments and observations any scaling or fractality, which are therefore seen as "good enough" substitutes for or, more accurately, symptoms of the long range spatio-temporal correlation supposedly causing them.
Where fractals and scaling are suspected in natural phenomena, observational support is often very limited (e.g. Avnir et al., 1998), both in terms of the length and time scales spanned by the data and in terms of its robustness. Broad distributions are frequently found, but there are few phenomena which offer sufficiently detailed and broad data to support power law scaling beyond reasonable doubt. It is difficult to reconcile the efforts that have been spent on experiments, data gathering and analysis with the claim that scaling or just power laws are ubiquitous in nature. One may therefore ask, rather provocatively: Is there really a (ubiquitous) problem to solve?
Unless one accepts the claim that SOC is the basis of scaling in nature, SOC itself (not just scaling) as defined in Section 7 is difficult to identify in a natural phenomenon or experiment directly. If anything, SOC has been offered as an explanation for certain scaling to appear spontaneously. At the theoretical end, none of even the computer models which are widely accepted as displaying all the hallmarks of SOC (see Section 8.3) has been solved or even only systematically approximated. In fact, there is not even a mean field theory that makes any quantitative reference to SOC taking place in spatially extended systems with some form of boundary at finite distances.
In summary, SOC was conceived as an explanation of a ubiquitous natural phenomenon, but it turns out that observational or experimental evidence is very difficult to come by. Hard evidence for SOC is mostly due to numerical modeling. To this day, there is no complete theory of SOC and it remains unclear why a phenomenon that should be observable under generic conditions is so rarely seen.
In that particular respect, SOC has shared the fate of the "directed percolation universality class" (e.g. Hinrichsen, 2000a; Ódor, 2004), which, although widely accepted to apply to an enormously large class of phenomena (Janssen, 1981; Grassberger, 1982), ranging from catalytic chemical reactions to epidemic spreading, still has very little experimental and observational support (Hinrichsen, 2000b, however see the laboratory experiments beginning with Takeuchi et al., 2007 and accompanying news coverage e.g. Hinrichsen, 2009, and the intriguing observational claim of Wanliss and Uritsky, 2010).

Paradigmatic versus good models
SOC has been introduced and motivated by the sandpile model, which is given in the form of a set of updating rules as used for the description of cellular automata. The initial numerical analysis revealed what was then coined "Self-Organized Criticality" and 1/f noise, later revised to be 1/f^2 by Jensen et al. (1989, also Christensen et al., 1991). The model itself was early on revised to display the Abelian property (Dhar, 1990), which is beneficial to both numerical and theoretical analysis. Over the years, it became increasingly clear that the sandpile model has some rather unfortunate features, in particular, that its supposed scaling behavior could never be fully determined (e.g. Manna, 1990; Lübeck and Usadel, 1997; De Menech et al., 1998; Dorn et al., 2001): the prime model of Self-Organized Criticality turns out not to display much of that notorious Criticality after all. On the other hand, it offers a vast array of secondary features with very interesting large scale properties which have been characterized analytically, such as waves (e.g. Ivashkevich et al., 1994), the average slope (e.g. Jeng et al., 2006), the (static) height-height correlation function (e.g. Jeng, 2005) or solvable variants with anisotropy (Dhar and Ramaswamy, 1989). None of this work, unfortunately, makes reference to scaling of avalanches, large scale activity correlations or spatio-temporal fractals, although the sandpile model certainly carries similar visual appeal (Creutz, 2004).
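To make the phrase "a set of updating rules" concrete, the following minimal sketch implements the height version of the sandpile on a small square lattice and checks the Abelian property numerically. The toppling threshold of 4, the open (dissipative) boundaries and the function names `relax` and `add_grain` are assumptions made here for illustration; the text above does not spell out the rules, and this is not meant as a reproduction of any particular published implementation.

```python
def relax(grid):
    """Topple every unstable site until the lattice is stable.

    A site with height >= 4 loses 4 grains and gives 1 to each nearest
    neighbour; grains pushed over the edge are lost (open boundaries).
    Returns the number of topplings, used here as the avalanche size.
    """
    n = len(grid)
    topplings = 0
    unstable = [(i, j) for i in range(n) for j in range(n) if grid[i][j] >= 4]
    while unstable:
        i, j = unstable.pop()
        if grid[i][j] < 4:
            continue  # stale entry: the site was already relaxed
        grid[i][j] -= 4
        topplings += 1
        if grid[i][j] >= 4:
            unstable.append((i, j))
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < n and 0 <= b < n:
                grid[a][b] += 1
                if grid[a][b] >= 4:
                    unstable.append((a, b))
    return topplings


def add_grain(grid, i, j):
    """Slow drive: add one grain at (i, j), then relax completely."""
    grid[i][j] += 1
    return relax(grid)


# Abelian property: the final state does not depend on the order of additions.
first = [[3] * 5 for _ in range(5)]
second = [[3] * 5 for _ in range(5)]
size_first = add_grain(first, 2, 2) + add_grain(first, 0, 0)
size_second = add_grain(second, 0, 0) + add_grain(second, 2, 2)
print(first == second, size_first == size_second)
```

The complete relaxation between additions is the separation of time scales in its simplest form: the drive acts only once the previous avalanche has finished.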
As far as "real sandpiles" are concerned, experimental studies failed to detect robust scaling (Jaeger et al., 1989; Held et al., 1990), although, as one may argue, expecting otherwise would stretch the name "sandpile model" beyond its intention as aide-memoire. One should remember that even the first SOC papers discussed a coupled harmonic oscillator model as well as the sandpile. Interestingly, the ricepile experiment and the ricepile or Oslo model both fared much better in that respect. As far as granular media are concerned, the Oslo model probably has the best experimental support (Ahlgren et al., 2002; Aegerter et al., 2003; Lőrincz and Wijngaarden, 2007).
The Oslo Model is, in fact, a representative of an entire universality class (Nakanishi and Sneppen, 1997), often referred to as the Manna universality class. Equally, the Manna Model (Manna, 1991; Dhar, 1999a) displays most clearly all features one could possibly expect from a self-organized critical model (Section 7):
• Firstly, robust, reproducible finite size scaling without dependence on any control parameter or details of the definition of the model (Dickman et al., 2002), such as the underlying lattice structure (Huynh et al., 2011).
• Secondly, spatio-temporal correlations, which were initially measured through integrated observables (avalanche size, duration, area, radius of gyration etc.). While temporal correlations are less of a concern (e.g. Pickering et al., 2012, for correlations on the slow time scale), spatial correlations can be extracted with some patience (McAteer et al., 2014).
• Thirdly, apparent self-tuning to a critical point that can be characterised in its own right, i.e. as a regular critical point without invoking SOC (Dickman et al., 2001).
In fact, it seems that two important theoretical tools are within reach for the Manna Model: an ε-expansion (Huynh and Pruessner, 2012), and a field-theoretic description which also reveals the universality class of a tuned variant (the conserved directed percolation universality class according to Rossi et al., 2000). The Manna Model also fits the list of "ingredients" of an SOC Model in Section 7: Thresholds, intermittency and separation of time scales. The universality class of the Manna Model is remarkably large (Pruessner, 2012, p. 177-181), containing even fully deterministic models (de Sousa Vieira, 1992; Paczuski and Boettcher, 1996). Going back to 1/f noise as the motivation and root of SOC, Jensen (1990) introduced a fully deterministic lattice gas inspired by experimentally observed 1/f spectra in superconductors. Simulations of this model exhibit 1/f spectra and the dissipation takes place on fractal-like structures. However, recently it was realized that the model does not display self-organization to criticality (Giometto and Jensen, 2012), but requires tuning to reach the critical point of the (conserved directed percolation) absorbing state phase transition. It is probably fair to say that despite its long history (van der Ziel, 1950) 1/f noise is no longer a motivation for SOC, possibly because of the confusion about its actual meaning (1/f versus 1/f^α) and also the possibility, at least in contemporary computer models, to characterize correlations directly in the time domain rather than indirectly via the power spectrum.

Distraction and priority
Some of the early papers in SOC paid insufficient attention to, and so may have led other people to neglect, related (and previous) relevant work. Bak and Chen openly declared that they could see little collaboration between those working on fractals and those working on 1/ f noise ("[. . . ] those working on fractal phenomena [. . . ] never [. . . ] seem to be interested in the temporal aspects, [. . . ] those working on "1/ f" noise never bother with the spatial structure of the source of the signal" Bak and Chen, 1989, as quoted in Section 4). Yet, laboratory critical phenomena already linked space and time, for example via critical slowing down, which is exactly the concept used to understand dynamical critical behaviour (e.g. Yeomans (1994)) in SOC. There was also work on the link between spatial and temporal fractality by Mandelbrot himself (Mandelbrot and Wallis, 1969) prior to his work on spatio-temporal cascades in turbulence beginning in the 1970s (Mandelbrot, 1972).
In the early days, some scientists may have perceived SOC as an aggressive foray into their established scientific fields, an attempted "hostile takeover", which contributed to the notion of "physicist hubris" (also Maddox, 1994). The Bak-Sneppen Model, for example, was introduced to the biologist audience by summarizing their own achievements and contrasting them with those of the authors: "However, there is no theory deriving the consequences of Darwin's principles for macroevolution. This is a challenge to which we are responding" (Sneppen et al., 1995, p. 5209). Plenty of similar examples can be found in the literature, some witty, some outright rude ("Is biology too difficult for biologists?" Bak, 1998). His fellow complexity scientists Cosma Shalizi and Bill Tozier at the Santa Fe Institute penned an amusing riposte to this tone in their preprint "a simple model of the evolution of simple models of evolution" (Shalizi and Tozier, 1999).

Predictability
Predictability has a somewhat ambiguous status in SOC. In their second paper, BTW argued for 1/f to be the result of a superposition of independent avalanche durations (Bak et al., 1988a, p. 369), as originally suggested by van der Ziel (1950). In other words, independence of events was an assumption at the very foundation of SOC as an explanation of 1/f noise. Although convenient for a straightforward quantitative relationship between the 1/f^α exponent and the avalanche duration distribution, independence is not needed for the argument about the origin of 1/f noise. Once introduced, the implied lack of predictability and, more generally, contingency became an important feature, a "selling point", very early, for example in the work on earthquakes mentioned above, but also as part of the wider perspective of SOC:

The [SOC] system exhibits punctuated equilibrium behavior, where periods of stasis are interrupted by intermittent bursts of activity. Since these systems are noisy, the actual events cannot be predicted; however, the statistical distribution of these events is predictable. Thus, if the tape of history were to be rerun, with slightly different random noise, the resulting outcome would be completely different. Some large catastrophic events would be avoided, but others would inevitably occur. No "quick-fix" solution can stabilize the system and prevent fluctuations. If this picture is correct for the real world, then we must accept fluctuations and change as inevitable. (Bak and Paczuski, 1995, p. 6690)

Although the authors stress here that it is the noise that is inherently unpredictable, its input is "amplified" even in fully deterministic SOC systems, because of their high susceptibility to external perturbation, which is characteristic for chaotic systems (e.g. the famous "butterfly-effect") and those at a critical point:

At several points the earthquake is almost dying, and its continued evolution depends on minor details of the crust of the earth far from the place of origin. Thus in order to predict the size of the earthquake, one must have extremely detailed knowledge on very minor features of the earth far from the place where the earthquake originated. (Bak and Tang, 1989, p. 15636)

On the other hand, long temporal correlations, both at the fast time scale within an avalanche and the slow time scale between avalanches, mean that the system maintains a memory of past activity. In systems where a globally conserved quantity is released in sudden bursts, this is immediately obvious: A slow external drive will eventually "run out of steam" to sustain large events in quick succession. In these cases, one can expect anti-correlations (Welinder et al., 2007). In general, long temporal correlations allow for particularly good predictability, at least of big events. This does not contradict the notion of large susceptibility, which indicates that the variance of responses to an external perturbation is particularly broad. Clearly, all of these observables are probabilistic by nature.
In a Nature debate on earthquake prediction, Bak (1999) later qualified and clarified his views on predictability: [T]he earthquakes in SOC models are clustered in time and space, and therefore also reproduce the observation O4 [seismicity is not Poissonian]. This implies that the longer you have waited since the last event of a given size, the longer you still have to wait; as noted in Main's opening piece, but in sharp contrast to popular belief! [. . . ] For the longest time-scales this implies that in regions where there have been no large earthquakes for thousands or millions of years, we can expect to wait thousands or millions of years before we are going to see another one. We can 'predict' that it is relatively safe to stay in a region with little recent historical activity, as everyone knows. There is no characteristic timescale where the probability starts increasing, as would be the case if we were dealing with a periodic phenomenon.
[. . . ] Unfortunately, the size of an individual earthquake is contingent upon minor variations of the actual configuration of the crust of the Earth, as discussed in Main's introduction. Thus, any precursor state of a large event is essentially identical to a precursor state of a small event. The earthquake does not "know how large it will become", as eloquently stated by Scholz. Thus, if the crust of the earth is in a SOC state, there is a bleak future for individual earthquake prediction. On the other hand, the consequences of the spatiotemporal correlation function for time-dependent hazard calculations have so far not been fully exploited! (Bak, 1999) In the same piece, Bak also acknowledged (and rejected) a differing perception of SOC: The [SOC] phenomenon is fractal in space and time, ranging from minutes and hours to millions of years in time, and from meters to thousands of kilometers in space. This behaviour could hardly be more different from Christopher Scholz's description that "SOC refers to a global state. . . containing many earthquake generating faults with uncorrelated states" and that in the SOC state "earthquakes of any size can occur randomly anywhere at any time". (Bak, 1999) It seems that the understanding of predictability in SOC became more differentiated over time. While initially the insight prevailed that stochasticity and susceptibility made SOC systems inherently unpredictable, that view made way for a better understanding of correlations. SOC systems do not signal the onset of a large event and may not even do so while the event occurs. Yet, event sizes remain correlated over very long time, allowing probabilistic predictions, such as the likelihood of two particularly large events occurring consecutively.
In SOC-inspired research in solar physics, waiting time distributions (WTDs) are the most prominent format of predictions. They are defined as the probability density function of the waiting times between consecutive events. A Poisson process produces an exponentially decaying waiting time distribution (van Kampen, 1992), which is therefore often used as the fingerprint for a lack of correlations. However, non-stationary point processes may give rise to (apparent) power law tails in the WTD (Aschwanden, 2011, Ch. 5). In the literature WTDs based on observations of solar flares have given varying results depending on the observational period, the X-ray wavelength, and whether individual active regions are considered in the analysis. No correlation was found between the elapsed time interval between successive deka-keV solar flares arising from the same active region and the peak intensity of the flare. This observation was taken to be in support of the solar flare SOC model by Lu and Hamilton (1991). In contrast, based on soft X-ray flare observations, Boffetta et al. (1999) found that the WTD displayed power law behavior in contradiction with the SOC model by Lu and Hamilton, which predicts Poisson-like statistics. To put this apparent mismatch in perspective, we want to emphasize that Poissonian waiting times, or more generally, lack of correlations are by no means typical in SOC. For example, the Omori-law of earthquakes (Omori, 1894; Utsu, 1961; Utsu et al., 1995) plays a very prominent rôle (Olami and Christensen, 1992; Hergarten and Neugebauer, 2002) in the analysis of the SOC model by Olami et al. (1992). One can only speculate whether the presence of Poissonian waiting times, P(t) = λ exp(−λt) for a process with rate λ, may have been confused with a power law distribution of waiting times in the limit of small λ (namely with large cutoff), as λ exp(−λt) = t^(−1) G(λt) with the scaling function G(x) = x exp(−x).
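The rewriting of the exponential waiting time distribution as a power law prefactor times a scaling function can be checked numerically. The sketch below is an illustration under our own assumptions (the function names, the rate of 2 events per unit time and the sample size are arbitrary choices); it verifies the identity λ exp(−λt) = t^(−1) G(λt) and confirms that waiting times sampled from a Poisson process follow the exponential survival law.

```python
import math
import random


def poisson_wtd(t, lam):
    """Waiting time density of a Poisson process with rate lam."""
    return lam * math.exp(-lam * t)


def scaling_function(x):
    """G(x) = x exp(-x), the scaling function quoted in the text."""
    return x * math.exp(-x)


def scaling_form(t, lam):
    """The same density written as a power law prefactor t^(-1) times G(lam t)."""
    return scaling_function(lam * t) / t


def sample_waiting_times(lam, n, seed=0):
    """Draw n waiting times between consecutive events of a Poisson process."""
    rng = random.Random(seed)
    return [rng.expovariate(lam) for _ in range(n)]


def survival_fraction(waits, t):
    """Empirical fraction of waiting times longer than t."""
    return sum(w > t for w in waits) / len(waits)


lam = 2.0
waits = sample_waiting_times(lam, 20000)
for t in (0.25, 0.5, 1.0):
    # Empirical survival fraction next to the exact value exp(-lam t).
    print(t, survival_fraction(waits, t), math.exp(-lam * t))
```

For λt much smaller than 1 the factor G(λt)/(λt) is close to 1 and the density is nearly flat at λ, so an apparent power law in a finite window has to be distinguished from an exponential with a large cutoff 1/λ by fitting over a range that includes the cutoff.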

Alternative scenarios
There have also been a number of successful attempts to provide alternative explanations for (apparent) critical behavior without tuning of a control parameter. Sornette (2006) has collected a number of scenarios under which apparently critical behavior can be observed without invoking SOC. One described very early (Sornette, 1994) suggests that an ordinary critical phenomenon is causing the scaling behavior, yet no self-organization takes place beyond the system's tendency to remain close to the critical point: The system "sweeps back and forth" across the critical point in an oscillatory fashion. Peters and Neelin (2006) performed an analysis reminiscent of one done when dealing with equilibrium continuous phase transitions. They studied precipitation of rain by identifying the water vapor density as the control parameter (in analogy with the temperature in a ferromagnetic phase transition) and the amount of precipitation as the order parameter. In addition they plotted the variance of the precipitation, and also how frequently the atmosphere is found at a given vapor density, which they call the residence time. The outcome is a set of diagrams which exhibit many similarities to how the order parameter and susceptibility behave in standard continuous phase transitions. The precipitation picks up abruptly at a certain vapor density and in the vicinity of this density they find that the fluctuations in the precipitation (corresponding to the susceptibility) peak.
However, the atmosphere does not self-tune to a particular critical value of the vapor density, but rather is found in a range of vapor densities. The near-critical behavior is related to the residence time having a peak near the value of the vapor density at which the precipitation has a sharp increase and the fluctuations in the precipitation peak.
This analysis may be interpreted in the following way. The dynamics of the precipitation pulls the atmosphere around the critical value of the vapor density, as vapor may build up beyond the critical value and rain showers can take the vapor density back down below the critical value. As a result the atmospheric system moves around a certain vapor density at which precipitation becomes very likely, but sharp tuning to a critical state does not take place.
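The kind of diagnostic described here, namely the mean and variance of an order parameter conditioned on a control parameter together with the residence time, amounts to simple binned statistics. The following sketch shows one way to compute them; the function name `conditional_statistics`, the bin edges and the toy numbers are hypothetical and are not taken from Peters and Neelin (2006).

```python
def conditional_statistics(control, order, edges):
    """Bin paired observations by the control parameter.

    For each bin [edges[k], edges[k+1]) return a tuple
    (residence fraction, mean of order parameter, variance of order parameter).
    Empty bins yield (0.0, None, None). Observations outside all bins are ignored.
    """
    nbins = len(edges) - 1
    groups = [[] for _ in range(nbins)]
    for c, o in zip(control, order):
        for k in range(nbins):
            if edges[k] <= c < edges[k + 1]:
                groups[k].append(o)
                break
    total = sum(len(g) for g in groups)
    stats = []
    for g in groups:
        if not g:
            stats.append((0.0, None, None))
            continue
        mean = sum(g) / len(g)
        var = sum((v - mean) ** 2 for v in g) / len(g)  # population variance
        stats.append((len(g) / total, mean, var))
    return stats


# Toy data: the order parameter is zero at low control values and
# fluctuates strongly at high ones.
vapor = [0.1, 0.2, 1.1, 1.2, 1.3]
rain = [0, 0, 1, 3, 5]
for row in conditional_statistics(vapor, rain, [0, 1, 2]):
    print(row)
```

In this picture, a sharp pick-up of the mean, a peak of the variance and a peak of the residence fraction near the same control value are the signatures discussed above; whether they coincide in real data is an empirical question the sketch does not address.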
A very similar analysis and scenario was found for the activity of the brain in resting state as measured by fMRI by Tagliazucchi et al. (2012). These authors analyzed the brain activity from the perspective of percolation and found that the brain moves around in the vicinity of a three dimensional percolation transition for the voxel activity measured by an fMRI scanner.
These results may suggest that at least in some situations the SOC phenomenology in reality consists of dynamics that by itself drives the system to the neighborhood of some critical transition but which, because of coupling between the dynamics and the order parameter, is unable to fine tune to the exact critical state. In an attempt to provide a theoretical foundation of SOC, it has been argued (Dickman et al., 1998, but Pruessner and) that this is a matter of system size: As the system size is increased, the dynamics is eventually "pinched" at the critical point. In the case of precipitation it appears that the order parameter (amount of rain) is able to pull the control parameter (vapor density) below the critical value and that the control parameter (due to the build up of supercritical vapor densities by evaporation) can grow above the critical value. This seems to be similar to the dynamical cause (Pruessner and Jensen, 2002a) that breaks the scaling of the Drossel and Schwabl (1992) forest fire model.

SOC in the wild: how has SOC inspired research on space and fusion plasmas?
The previous section may read like a catalogue of woe, and it is important to see things in perspective. A theory as bold as SOC was bound to be controversial, so we will now balance the controversy with a very brief sketch of how the research fields of three of the authors (Chapman, Crosby and Watkins), in solar system and laboratory fusion plasmas, have been inspired by the SOC paradigm into new and productive directions. We direct the reader in search of more detail to the companion papers in this volume (McAteer et al., 2014; Sharma et al., 2015) and the review by Chapman and Watkins (2001).
Several problems in space plasma physics resemble SOC. One clear example is the wideband distribution of solar flare energies, and solar flares remain one of the most intriguing examples of SOC-like behavior. Most likely caused by a magnetic instability that triggers a magnetic reconnection process in a large range of sizes and time scales, solar flares produce emission in almost all wavelengths (e.g. gamma rays, hard X-rays, soft X-rays, extreme ultraviolet, Hydrogen α emission, radio wavelengths, and sometimes even in white light). Datlowe et al. (1974), Lin et al. (1984) and Dennis (1985) were some of the first to determine frequency distributions of solar flare hard X-ray observations (see Crosby et al. (1993) for a historical summary).
From micro-and nano-flares to the largest flares the flare energy power law distribution is found to cover over eight orders of magnitude (Aschwanden, 2011). The energy distribution contains all flare sizes, independently of the mechanism by which the released energy is converted. Like the Gutenberg-Richter law in seismicity these observations predated SOC, and Lu and Hamilton (1991) proposed a model based on BTW's sandpile to reproduce them. In their model each solar flare is considered an avalanche event in a critical system. The way the magnetic energy is redistributed, how the system is driven (the "loading mechanism"), and the "incorporation" of magnetohydrodynamics (MHD) have all been further developed by others, and interesting SOC-inspired variants have also been proposed such as the cascade of reconnecting loops studied by Hughes et al. (2003).
Following Lu and Hamilton (1991) several workers began analyzing frequency distributions on large solar flare datasets in the context of SOC (e.g. Crosby et al., 1993; Lee et al., 1993; Georgoulis et al., 2001). Many studies also followed that used solar flare observations in other wavelengths. For example, solar flare X-ray data have been subdivided according to a parameter and frequency distributions determined on the resulting sub-sets, revealing positive correlations in the parameters. In the context of model validation, observational results such as these put constraints on models that need to be able to reproduce the observations.
Turning now to the Earth's local plasma environment, the magnetosphere, our readers, whether space scientists or not, will know of the dramatic auroral displays seen over Earth's polar regions. These reveal a range of intricate patterns, and many phenomena have been identified in them on a wide range of temporal and spatial scales, from seconds to hours, and from one to thousands of kilometers (e.g. panels A to C of the figure in Freeman and Watkins (2002)). In the early 1990s some researchers began to focus on whether there might be "universal" aspects to auroral structure. As well as chaotic nonlinear dynamics, SOC was a natural avenue for this inquiry, and several parallel lines of attack developed. We will mention just a few papers here; a more comprehensive bibliography of early work on magnetospheric SOC can be found in Watkins et al. (2001).
One strand was experimental. Takalo et al. (1993) for example computed structure functions (as also widely used in turbulence research and surface growth) on the auroral electrojet (AE) index. They found a scaling region between about 1 minute and 2 hours. The scale break above 2 hours was attributed to the quasi-periodic interruption of the time series by a global scale auroral disturbance, the magnetospheric substorm.
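As an illustration of what a structure function analysis involves, the sketch below computes S_q(τ), the mean of |x(t+τ) − x(t)|^q, for a sampled time series and estimates a scaling exponent from a log-log fit over a chosen range of lags. The function names and the fitting range are our own choices; this is a generic sketch and not the procedure of Takalo et al. (1993).

```python
import math


def structure_function(x, lag, q=2):
    """S_q(lag): mean of |x[i + lag] - x[i]|**q over all available i."""
    diffs = [abs(x[i + lag] - x[i]) ** q for i in range(len(x) - lag)]
    return sum(diffs) / len(diffs)


def scaling_exponent(x, lags, q=2):
    """Least-squares slope of log S_q(lag) against log(lag).

    Only meaningful if the chosen lags lie inside a scaling region,
    which has to be established separately (e.g. by inspecting the plot).
    """
    points = [(math.log(l), math.log(structure_function(x, l, q))) for l in lags]
    n = len(points)
    mean_x = sum(a for a, _ in points) / n
    mean_y = sum(b for _, b in points) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in points)
    den = sum((a - mean_x) ** 2 for a, _ in points)
    return num / den


# Sanity check on a linear ramp, for which S_q(lag) = lag**q exactly.
ramp = list(range(100))
print(structure_function(ramp, 5, q=2), scaling_exponent(ramp, [1, 2, 4, 8], q=2))
```

A scale break, such as the one attributed above to the substorm, would show up as a change of slope when the fit is repeated over different ranges of lags.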
A complementary theoretical thread took several forms. The strand most directly inspired by sandpile models initially involved pointing out the similarities between key properties of AE, determined by power spectral (e.g. Consolini, 1997; Uritsky and Pudovkin, 1998) and threshold exceedance (e.g. Consolini, 1997) techniques, and those of existing sandpile models, both BTW's and the running sandpile model of Hwa and Kardar (1989). The pioneering work on power spectra of AE by Tsurutani et al. (1990), which showed it to exhibit a low frequency "1/f" region, was now argued to be indicative of SOC. Two new measurements directly inspired by SOC were the probability densities of the time for which the AE index exceeded any given fixed threshold, the "burst duration", and of the burst size (the integrated value above the threshold for each burst). Both were found to have fat tailed pdfs (e.g. Consolini, 1997), for bursts from the minimum measurement scale of 1 min to the longest burst lifetimes (of order 1 day). Subsequent work (e.g. Freeman et al., 2000) on burst duration found that superposed on the fat tail was another component centered on a fixed scale of about 100 min, corresponding to the global substorm phenomenon.
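The burst duration and burst size defined by threshold exceedance can be extracted from a sampled time series in a few lines. The sketch below is a minimal version under our own assumptions: the function name `bursts` is hypothetical, the series is uniformly sampled with spacing `dt`, "above" means strictly greater than the threshold, and the burst size is taken as the integrated excess over the threshold.

```python
def bursts(series, threshold, dt=1.0):
    """Return a list of (duration, size) for each contiguous run above threshold.

    duration: number of consecutive samples above threshold, times dt.
    size: sum of (value - threshold) over the run, times dt.
    A run still in progress at the end of the series is reported as well,
    although its true duration and size are censored by the record length.
    """
    found = []
    duration = 0
    size = 0.0
    for value in series:
        if value > threshold:
            duration += 1
            size += value - threshold
        elif duration:
            found.append((duration * dt, size * dt))
            duration, size = 0, 0.0
    if duration:
        found.append((duration * dt, size * dt))
    return found


# Two bursts above threshold 1: samples (2, 3) and the single sample 5.
print(bursts([0, 2, 3, 0, 0, 5, 0], threshold=1))
```

Histograms of the resulting durations and sizes, repeated for several thresholds, are what would then be inspected for fat tails or for an additional component at a characteristic scale.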
In parallel with the above developments in space physics, SOC had also already been fruitful in fusion research, where the wider properties and dynamics of avalanching systems are of interest in addition to their statistical properties. It had been noted (e.g. Dendy et al., 2007) that magnetically confined tokamak plasma experiments for fusion are driven, dissipative systems with multiple steady states, anomalous transport, and bursty release of energy and material. This prompted the development and extensive study of several sandpile/avalanche models (surveyed in Dendy and Helander, 1997; Perrone et al., 2013) in the fusion context, specifically to reproduce key observables which are not necessarily power law avalanche distributions. A key observable in tokamaks is the correlation between the distinct statistical properties of bursty energy release (edge localised modes) and the confinement state of the plasma (low and high confinement states, or L and H modes). This essential property was captured in a "sandpile with an H mode". This model also captures aspects of anomalous transport in tokamaks, for example the observed, unexpected, inward transport against the temperature gradient. This fusion-relevant model directly followed from one developed (Chapman et al., 1998) to explore the role of SOC in magnetospheric substorms, and is an example of transfer of ideas from one research area to another and back. The CDH model (Chapman et al., 1998), which could be consistent both with fat tailed ionospheric energy dissipation events, and with magnetospheric events with a characteristic size provided that they were systemwide events like the substorm, was inspired by work on inertial sandpiles for tokamaks (Helander, 1998). Following Chapman et al. (1998) a more direct observational test was suggested in Lui et al. (2000), who proposed the use of a threshold exceedance measure to investigate the spatial structure of the aurora.
Using UV images of the aurora from cameras on the Polar spacecraft, Lui et al. (2000) identified auroral blobs, where the auroral emission intensity exceeded some fixed threshold, during both quiet and substorm intervals. As in the AE index time series analysis, a fat tailed pdf was found both for the number of threshold exceedances and for their areas, with an additional population centred on a fixed scale corresponding to the global substorm disturbance. It was also realised that, unlike in the ideal SOC paradigm, the driving in such driven dissipative astrophysical confinement systems would be highly variable, leading to studies of the extent to which the fat tailed avalanche distribution was robust against this (Watkins et al., 1999).
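As a rough illustration of identifying blobs in a single thresholded image, the sketch below labels connected regions of above-threshold pixels and returns their areas. It is not the algorithm used in the cited work: the function name `blob_areas`, the choice of 4-neighbour connectivity and the toy image are all assumptions made here for illustration.

```python
from typing import List


def blob_areas(image: List[List[float]], threshold: float) -> List[int]:
    """Areas (pixel counts) of 4-connected regions where intensity > threshold."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or image[r0][c0] <= threshold:
                continue
            # Flood fill from this unvisited bright pixel to collect one blob.
            stack = [(r0, c0)]
            seen[r0][c0] = True
            area = 0
            while stack:
                r, c = stack.pop()
                area += 1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and not seen[rr][cc]
                            and image[rr][cc] > threshold):
                        seen[rr][cc] = True
                        stack.append((rr, cc))
            areas.append(area)
    return areas


# Hypothetical toy "image" with three separate bright regions.
toy_image = [
    [0, 5, 5, 0, 0],
    [0, 5, 0, 0, 7],
    [0, 0, 0, 0, 7],
    [9, 0, 0, 0, 0],
]
areas = blob_areas(toy_image, threshold=1.0)
print(areas)  # [3, 2, 1]
```

Applied image by image, a routine of this kind yields a number of blobs and a list of areas per frame, from which the distributions discussed above could be accumulated.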
However, Uritsky et al. (2002) argued that the Lui et al. (2000) approach overestimated the number of spatio-temporally evolving blobs, because a blob counted in one image at one time could be the same one counted at another time. These authors thus analysed Polar images spatiotemporally, and claimed that pdfs of maximum blob area or integrated area over blob lifetime followed power law distributions over the entire observable range (3-5 orders of magnitude), and similarly for the blob lifetime, maximum dissipated power and dissipated energy (see also Freeman and Watkins, 2002).
The conceptual parallel between avalanches in SOC models and those apparently observed in the aurora is appealing, but an immediate complication resulted from the fact that the aurora is a projection of the dynamic charged particle structure of the near-Earth magnetosphere. Because satellite measurements in the tail region of the magnetosphere have shown bursty bulk flows of charged particles that may be individually correlated with auroral emissions, and may have a scale-free distribution of durations, it was argued (Lui et al., 2000; Uritsky et al., 2002) that these were, essentially, the avalanches.
A somewhat different, and complementary, scenario to BTW's SOC for the dynamic structure of the magnetosphere was however suggested by Chang (1992). In his picture, plasma wave resonances create coherent structures of various sizes that merge and interact to create new structures. He proposed that continual interactions of this type may naturally self-organize or be forced into a scale-free hierarchy of coherent structures, like the ordering of spin structures in the Ising model at the critical point. In his view the distinction between self-organized and forced criticality is essentially about the nature of the thing that drives the system (the solar wind in the case of the magnetosphere). As we have shown, in BTW's SOC the driving rate is necessarily very slow compared to the interaction and merging time scales. Intriguingly, the opposite behaviour was predicted for some other non-equilibrium models where the onset of criticality appears above some driving rate (Nicolis and Malek-Mansour, 1984), and something analogous is also seen in turbulence, where the onset of complex behaviour occurs above a given value of the Reynolds number. Chapman and Watkins (2009) and Chapman et al. (2009) have clarified this behaviour by noticing that the dimensionless control parameter formed by fuelling rate and dissipation rate in SOC models is effectively an inverse Reynolds number.
Consideration of the driver has, however, as elsewhere in complexity research, raised a thorny issue: the supply of energy from the solar wind into the magnetosphere itself has a fractal flavour, because the solar wind is turbulent. Studies (Freeman et al., 2000) using static measures of fractal properties in long (of order years), non-overlapping solar wind and auroral time series suggested that, at least for the AE index, the scale free behaviour might originate in the solar wind, rather than be self-organised in the magnetotail. Comparisons using time-dependent measures on shorter (of order months or less) but overlapping series, however (Uritsky et al., 2001), indicated that an internally generated scale-free component may coexist with a solar wind driven one. Debate on this topic has continued, and is not unique to the magnetosphere, but is reminiscent, for example, of the debate in theories of punctuated evolution between the influence of "external" events (such as asteroid impact) on extinctions and self-organised "internal" extinctions.
Even without an SOC origin, power law distributions can be used to estimate the maximum strength of natural hazards and are increasingly being used by reinsurance companies and governments to assess the risks they pose. The space industry is no exception to this trend: such information is essential when designing spacecraft shielding, which mitigates extreme events as well as the long-term effects of space weather.

Summary and conclusion
Readers who have made it to the end of this article may now appreciate why our first epigraph quoted the Dude from "The Big Lebowski", as untangling the history, meaning, and current status of SOC really has required the reader (and authors) to keep track of a "lotta strands". This is made even harder by the diversity of the research fields in which these strands originate, all of which have not only their own notations and traditions, but also very different ideas about what a good model is, and how to wield Occam's razor! However, we hope we have also brought out the reasons why our second epigraph quoted physics Nobelist Philip Anderson, who described SOC as of "paradigmatic value, as the kind of generalization which will characterize the next stage of physics". In our concluding section we now try to draw out two specific issues, namely the current status of the SOC conjecture and accompanying theory, and the testability of SOC in space and lab plasmas, and give our views on these.

SOC theory: where do we stand?
SOC was conceived by Bak, Tang and Wiesenfeld against the background of condensed matter theory, statistical mechanics and, to a lesser extent, dynamical systems, with the intention of explaining spatio-temporal fractals in nature. The initial core claim, that some spatio-temporal fractals (i.e. long time and range correlations) are produced by systems that are organizing themselves to a continuous phase transition, where such correlations are typical, was soon extended to encompass a much greater spectrum of phenomena. Considerable confusion has grown over the years as to what has been established by and about SOC, to what extent it has been confirmed analytically, numerically, by observation in nature or experimentally, where it applies and what it aims to explain.
There are few systems that display SOC in all its glory, but they do exist and they provide clear evidence that it works in precisely the way originally envisaged. SOC may be at work in some natural phenomena, such as earthquakes, solar flares and precipitation, but SOC is almost certainly not ubiquitous. To some, more traditionally-minded communities, in particular in condensed matter theory, the phenomenon of SOC nevertheless comes as a great surprise, as spontaneous non-trivial scaling in this area is otherwise confined to systems displaying generic scale invariance, without intermittency or self-organization to a critical point, and invariably requiring some scale-free source or input, such as noise.
Despite being hampered by re-interpretations not originally intended by its authors (and sometimes because of these!), SOC has inspired much research into multiscale phenomena and has helped bring together disjoint communities, in particular those interested in heavy tails, spatio-temporal fractals and 1/f noise. All of these were known to specialists (e.g. van der Ziel, 1950; Schick and Verveen, 1974; Weissman, 1988), but all had relatively low cross-disciplinary visibility before SOC, as the authors can testify. In the long term this may be one of the most important legacies of the subject.
While, in some of these areas, the strict definition of SOC has given way to a broader view and sometimes sweeping claims, it has also provided the very fruitful paradigm for a much deeper understanding of the phenomena concerned, as researchers became aware of the distinct possibility that some very simple interactions on a microscopic scale carry over to and evolve across many different time and length scales, effectively providing the same basic physics in rescaled form across many scales. In that sense, SOC realized the aspirations and exhortations of Anderson (1972) and Wilson (1979) in that it provided a framework to ask questions about the crucial, effective, simple interactions that are present across all scales of a multiscale phenomenon, and which must therefore be present, detectable and describable (in bare, unscaled form) at some small scale. SOC stripped away the need for a detailed microscopic physics and gave way to a more global perspective of the basic physical principles that govern a phenomenon on every scale. This perspective of looking for the basic interaction that governs a physical system across scales is different from classic reductionism, which suggests that the overall phenomenon is some averaged version of the internal dynamics. SOC suggests that the interaction is present on all scales, although in some scaled form, as it slowly morphs and evolves in space and time. In that respect, SOC provides a much sharper quantitative emphasis than some of the more recent complexity-inspired points of view.
Broad, heavy tailed distributions and correlations, whether or not they can justifiably be called power laws, and regardless of whether they are indicative of scaling, are observed in many fields and pose a challenge. This is because they suggest that phenomena are not confined to a particular length scale and that the physics driving them manages to cross scales. To understand them better would allow a better quantitative characterization of fluctuations and associated risks and is likely to point at the relevant underlying physics. SOC is one such attempt at a better understanding. With all its flaws and shortcomings, it is difficult to identify a more successful approach.

Testing for SOC in space plasmas
Having clarified our best current understanding of what an SOC state is, the other key question is then: are solar flares, and other interesting plasma systems, really in this SOC state? Testing for SOC has mainly been centered on testing for power law statistics of event sizes. Observationally this presents a fundamental challenge, as the confined plasma systems are of finite size. The solar corona offers the broadest range of spatial scales and indeed, here we see power laws over up to 8 decades. Probability distributions of different auroral "spot" variables (observed in Earth's ionosphere), constructed using the results of ground-based and satellite camera observations, extend across more than two orders of magnitude in space (Kozelov et al., 2004), enabling the derived burst variables (which convolve a time variable) such as size, duration and so forth to span a much larger range. This distinction between spatial and spatiotemporal scaling ranges has been well known for some time, see e.g. Avnir et al. (1998). A second challenge is that the developing understanding of how to precisely test for SOC, as discussed above, has "raised the bar" in terms of what is required for a truly convincing demonstration. Solid data on spatio-temporal correlations remain out of reach and thus global measures, such as spatial (activity) integrals, have to be used. We will only touch on some points here; see also McAteer et al. (2014). First, it is important to distinguish SOC, or indeed multiscale avalanching, from turbulence, and this follows from the intrinsic separation of timescales in these systems. The idealized SOC limit is when the ratio of driving and dissipation is taken arbitrarily small, and this is in the opposite sense to turbulence (Chapman et al., 2009). The finite size of these systems makes distinguishing SOC and turbulence in these plasmas from observations of the scaling properties alone a challenge, and this has led to controversy (Watkins et al., 2009).
Second, one must exclude "trivial" similarity that can occur in linear systems. A simple Brownian walk is self-similar but does not imply spatio-temporal correlations. An example of a spatially extended system that incorporates dynamics is the Edwards-Wilkinson model (Edwards and Wilkinson, 1982), where grains are randomly dropped onto a surface which is smoothed by linear spatial diffusion. In such a model one observes power laws in the sizes of patches of the surface where the height exceeds a threshold. However, the model is linear and in that sense trivial (Chapman et al., 2004, also Appendix A). Moreover, even escaping triviality and linearity, MHD plasmas, along with hydrodynamics, exhibit similarity in their non-linear dynamics, yet are certainly not instances of SOC. Classic examples are non-linear Alfvén waves, shocks and solitons. It therefore does not suffice to look for non-trivial power law event size statistics per se as the "hallmark" of SOC. As discussed in section 1.7 it is necessary, but not sufficient.
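The sense in which a deposition-plus-diffusion model is linear can be checked numerically: the surface produced by the sum of two deposition histories equals the sum of the surfaces produced by each history alone. The sketch below uses a discretised, one-dimensional, periodic surface with an explicit diffusion update. The function names, the diffusion coefficient, the lattice size and the use of uniform random deposits are assumptions chosen for illustration, not the specification of the cited model.

```python
import random
from typing import List


def ew_step(height: List[float], deposit: List[float],
            diffusion: float = 0.25) -> List[float]:
    """One explicit update: discrete Laplacian smoothing plus deposited material."""
    n = len(height)
    return [
        height[i]
        + diffusion * (height[(i + 1) % n] - 2.0 * height[i] + height[i - 1])
        + deposit[i]
        for i in range(n)
    ]


def ew_run(deposits: List[List[float]], diffusion: float = 0.25) -> List[float]:
    """Evolve an initially flat periodic surface through a deposition history."""
    height = [0.0] * len(deposits[0])
    for deposit in deposits:
        height = ew_step(height, deposit, diffusion)
    return height


rng = random.Random(0)  # fixed seed so the illustration is reproducible
sites, steps = 32, 200
drive_a = [[rng.random() for _ in range(sites)] for _ in range(steps)]
drive_b = [[rng.random() for _ in range(sites)] for _ in range(steps)]
drive_sum = [[a + b for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(drive_a, drive_b)]

h_a = ew_run(drive_a)
h_b = ew_run(drive_b)
h_sum = ew_run(drive_sum)

# Superposition: the response to (A + B) equals response(A) + response(B),
# up to floating point rounding.
max_deviation = max(abs(s - (a + b)) for s, a, b in zip(h_sum, h_a, h_b))
print(max_deviation)
```

A threshold-toppling sandpile would fail this superposition check, because whether a site topples depends non-linearly on the accumulated height.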
Given the difficulties inherent in observational verification of (idealized) self-similarity, alongside the clear evidence for multiscale bursty energy release and the simultaneous operation of a zoo of plasma processes operating on multiple spatiotemporal scales which are strongly coupled to each other, the original SOC paradigm can be said to have "mutated" into a broader concept of "multiscale avalanching" plasma systems. As a concept around which to order the observations, multiscale avalanching has been a remarkable success. Without the concepts of plasmas as multiscale systems (e.g. Chang, 1992), phenomenologists would still be restricted to the detailed plasma physics of an energy release event in isolation. Avalanching, that is, bursty transport and energy release events on multiple scales, is observed to be ubiquitous in driven, dissipative plasmas, and involves fully non-linear physics, coupling across multiple scales. It remains an open and highly topical problem across astrophysical and laboratory plasmas.

Appendix A: Dimensional analysis, scaling and self-similarity
Scaling is a continuous symmetry obeyed by certain physical observables. It relates, quantitatively, the value of a physical observable at one set of parameters to its value at another set of parameters. It comes in two forms: The trivial form is obtained by a dimensional analysis, the non-trivial form is the manifestation of self-similarity, identified most prominently by the renormalization group, but also accessible numerically and by data analysis (such as a data collapse).
Scaling is a very powerful concept in physics, because it allows the analysis of a phenomenon on a vastly different scale than it is observed on. What is more, the same fundamental physics must be at work at very different scales, which often leads to very deep insights. The fact, for example, that the electrical force between two charges decays like $1/r^2$ carries the signature of the dimensionality of the space around us, $d - 1 = 2$, and is explained within the framework of Quantum Electrodynamics by the masslessness of the photon.
Scaling can be applied amazingly broadly, as demonstrated in Buckingham's (1914) Π theorem which introduced the method of dimensional analysis more than 100 years ago. The scaling determined by dimensional analysis is often referred to as trivial: It is an unavoidable consequence of finding, imposing or assuming a certain physical reality. For example, assuming that the frequency $\omega$ of a frictionless mathematical pendulum depends only on its length $\ell$, its mass $m$ and the gravitational acceleration $g$ has the immediate consequence that it must necessarily be a constant multiple of $\sqrt{g/\ell}$.
In the small angle approximation the constant is unity, but even without the small angle approximation, and thus allowing for dependence on the amplitude $\varphi_0$, dimensional analysis tells us that the frequency must be of the form $\omega = f(\varphi_0)\sqrt{g/\ell}$, where $f(\varphi_0)$ is an (a priori unknown) function of $\varphi_0$. By dimensional analysis, the observable $\omega$ obeys the remarkable symmetry $\omega(\varphi_0, \ell, g) = T^{-1}\,\omega(\varphi_0, L^{-1}\ell, L^{-1}T^{2}g)$ for all finite, positive, real $T$ and $L$. In particular, by choosing $T = \sqrt{\ell/g}$ and $L = \ell$, $\omega(\varphi_0, \ell, g) = \sqrt{g/\ell}\,\omega(\varphi_0, 1, 1)$ and we can identify $f(\varphi_0) = \omega(\varphi_0, 1, 1)$. Dimensional analysis can only ever give rise to trivial exponents, usually integers or simple fractions, which are a necessary consequence of the dimension of the quantities used to describe the physical phenomenon and the assumption that the physical reality of a phenomenon is independent from the choice of units used to describe it. [18] We notice that "trivial" is a loaded word, but it is a technical term, used to point to the fact that the scaling obtained is identical to that found in a system without considering the effect of non-linearities (which otherwise make "all the music"). In such a linear system, solutions can be superimposed, suggesting a lack of interaction. One such solution may be the trivial solution, and by association the linear system and its exponents are therefore called trivial. The term "trivial" obscures the fact that there are famous examples of dimensional analysis producing powerful and far-reaching results, such as Kolmogorov's (1941) 5/3 law. Yet, the deep insight does not consist in the dimensional analysis, but in determining the physical quantities that enter into a physical phenomenon. Dimensional analysis is only a relatively straightforward manifestation of that achievement.
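The pendulum scaling above can be checked numerically. The sketch below integrates the pendulum equation of motion with a fixed time step that is deliberately chosen independently of the length and the gravitational acceleration, and then forms the dimensionless combination of frequency times the square root of length over gravitational acceleration. The function name `pendulum_frequency`, the Runge-Kutta integrator, the step size and the parameter values are all assumptions made for this illustration.

```python
import math


def pendulum_frequency(phi0: float, length: float, gravity: float,
                       dt: float = 1e-4) -> float:
    """Angular frequency of a pendulum released from rest at angle phi0 (0 < phi0 < pi).

    Integrates theta'' = -(gravity / length) * sin(theta) with classical RK4 and
    times the first passage through theta = 0, which is a quarter period.
    """
    if not 0.0 < phi0 < math.pi:
        raise ValueError("phi0 must lie strictly between 0 and pi")

    def accel(theta: float) -> float:
        return -(gravity / length) * math.sin(theta)

    theta, rate, t = phi0, 0.0, 0.0
    while True:
        # One RK4 step for the system theta' = rate, rate' = accel(theta).
        k1t, k1r = rate, accel(theta)
        k2t, k2r = rate + 0.5 * dt * k1r, accel(theta + 0.5 * dt * k1t)
        k3t, k3r = rate + 0.5 * dt * k2r, accel(theta + 0.5 * dt * k2t)
        k4t, k4r = rate + dt * k3r, accel(theta + dt * k3t)
        new_theta = theta + dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0
        new_rate = rate + dt * (k1r + 2 * k2r + 2 * k3r + k4r) / 6.0
        if new_theta <= 0.0:
            # Linear interpolation of the zero crossing inside this step.
            quarter = t + dt * theta / (theta - new_theta)
            return 2.0 * math.pi / (4.0 * quarter)
        theta, rate, t = new_theta, new_rate, t + dt


# Same amplitude, very different (hypothetical) lengths and gravities.
f_a = pendulum_frequency(1.0, 1.0, 9.81) * math.sqrt(1.0 / 9.81)
f_b = pendulum_frequency(1.0, 4.0, 1.6) * math.sqrt(4.0 / 1.6)
# Small amplitude: the dimensionless constant should be close to unity.
f_small = pendulum_frequency(0.01, 2.0, 9.81) * math.sqrt(2.0 / 9.81)
print(f_a, f_b, f_small)
```

The first two numbers should agree to within the integration error, since they share the same amplitude, while the third should be close to one. The amplitude dependence itself is not fixed by dimensional analysis and has to come from solving the dynamics.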
Trivial scaling does not produce the richness and variety of power laws found in nature. For example, the fractal dimension of percolating clusters on a square lattice is 91/48 (Stauffer and Aharony, 1994). This is generally possible in the presence of dimensionless, finite ratios involving the characteristic length or distance (or, more generally, time) the system is studied under. In case of the pendulum mentioned above, for example, the initial condition might be expressed as an initial displacement $d$ of the pendulum, so that $\varphi_0 = \sin^{-1}(d/\ell)$. In that case, it is no longer obvious how $\omega$ scales in $\ell$; everything is possible, at least in principle. [19] The presence of non-trivial power law spatio-temporal correlations indicates on the one hand that competing scales are present (otherwise exponents are determined by dimensional analysis), on the other hand that they do not dominate the behavior of the system, in the sense that their physics does not take over on large spatiotemporal scales. Rather, they compete with and balance each other. Otherwise a characteristic scale appears, with one "physics" below and one "physics" above that scale. [20] A characteristic scale is present, for example, when spatio-temporal correlations decay (asymptotically) exponentially, say $C(r) = C_0 \exp(-r/\xi)$ with some characteristic scale $\xi$ and unknown amplitude $C_0$. The system "knows" the scale and it can be determined from within, for example as the inverse slope of the plot of $\log\left(C(r)/C(2r)\right) = r/\xi$ against $r$.

[18] Expressing distances in units of time is an everyday example: "It's four hours to Washington." vs "It's 260 miles to Washington." (Pruessner, 2004b).
[19] To claim that the frequency of a pendulum depends on the absolute value of its initial displacement suggests a physics different from the one where the frequency depends only on the initial angle.
When correlations decay like a power law, say $C(r) = C_0 (r/\xi)^{-\mu}$ with some unknown exponent $\mu$, no such manipulation is possible: without knowing $C_0$, the length $\xi$ cannot be extracted. For example, $C(r)/C(2r) = 2^{\mu}$, independent of both $r$ and $\xi$. In the presence of power law correlations, there is a lack of scale from within; in other words, the system is self-similar. Self-similarity generally manifests itself in the non-trivial scaling of correlation functions.
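The contrast between the two cases can be made concrete with the ratio of the correlation function at a distance and at twice that distance. In the sketch below the values of the correlation length, the exponent and the amplitude are arbitrary assumptions chosen for illustration: for the exponential form the logarithm of the ratio grows linearly with distance and its inverse slope returns the characteristic scale, whereas for the power law form it is the same constant at every distance and carries no information about the length.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
xi, mu, c0 = 5.0, 0.75, 3.0


def exponential(r: float) -> float:
    """C(r) = C0 * exp(-r / xi): correlations with a characteristic scale."""
    return c0 * math.exp(-r / xi)


def power_law(r: float) -> float:
    """C(r) = C0 * (r / xi) ** (-mu): scale-free correlations."""
    return c0 * (r / xi) ** (-mu)


def log_ratio(corr, r: float) -> float:
    """log(C(r) / C(2r)); the unknown amplitude C0 cancels in the ratio."""
    return math.log(corr(r) / corr(2.0 * r))


radii = [1.0, 2.0, 4.0, 8.0]
exp_ratios = [log_ratio(exponential, r) for r in radii]  # equals r / xi
pow_ratios = [log_ratio(power_law, r) for r in radii]    # equals mu * ln 2

# Exponential case: the inverse slope against r recovers xi from within.
xi_estimate = (radii[-1] - radii[0]) / (exp_ratios[-1] - exp_ratios[0])
# Power law case: only the exponent can be read off, not the length xi.
mu_estimate = pow_ratios[0] / math.log(2.0)
print(xi_estimate, mu_estimate)
```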
Power law correlations are typically observed at transitions, also in dynamical systems, when a fixed point changes stability. They have been extensively studied within the field of critical phenomena ever since Onsager's (1944) solution of the two-dimensional Ising Model suggested a clash with Landau-theory (Stanley, 1971), which produces the same exponents as dimensional analysis suggests. Power law correlations are generally observed at continuous transitions, a smooth change of phase as opposed to the sharp, so-called first order phase transitions as seen when water boils in a kettle. They can be observed in the spectacular display of critical opalescence in carbon dioxide (Stanley, 1971).
As pointed out several times above, correlation functions are difficult to measure directly, in particular in SOC models which often have open boundaries and are therefore not translationally invariant. As a result, correlation functions depend not only on the distance, but on absolute coordinates, so that spatial averaging is not possible. Moreover, correlations are often very weak, so that the indirect, integrated measures mentioned above, such as the avalanche sizes and durations, often show clearer signs of scaling.
The exponents characterizing the power laws are usually expected to be universal, i.e. they are the same in vastly different systems. Because they are characteristics of asymptotics, which in turn are determined by basic features of the interactions, exponents are intricately linked to the symmetries of the interactions and the system as a whole. It was one of the great insights of the renormalization group (Wilson, 1971, 1979) that exponents are characteristics of the symmetries involved. Results are particularly strong for phase transitions in two-dimensional equilibrium systems with discrete symmetries: Conformal field theory (Langlands et al., 1992; Cardy, 1992) and Stochastic Loewner Evolution (Lawler et al., 2001; Smirnov and Werner, 2001) were able to demonstrate that there are exactly 6 universality classes (Fogedby, 2009).
Exponents are not the only universal quantities. Also universal are amplitude ratios, moment ratios and scaling functions, although in the case of finite size scaling they often depend on boundary conditions (Barber, 1983; Privman et al., 1991). Traditional critical phenomena consider scaling of an observable in an infinite system as a function of some control parameter, say the magnetization density $m$ as a function of the temperature difference to the critical value $T - T_c$, observing $m \propto (T_c - T)^{\beta}$ for $T < T_c$. Alternatively, a system may be tuned to the critical point and the scaling of a correlation function is studied as a function of the distance. At the critical point, system wide observables can be studied for their dependence on the system size, known as finite size scaling, say $m \propto L^{-\beta/\nu}$, for a system with linear extent $L$. In both cases, a length scale (such as the distance or the size of the system) controls the scaling of the observable in a system right at the critical point. Leaving correlation functions aside, finite size scaling of some global observables is the only scaling displayed by SOC. For example, the cutoff $s_c$ in the avalanche size distribution scales like $s_c \propto L^{D}$ as a function of $L$. The exponent $D$, the fractal dimension of the characteristic avalanche size, is expected to be universal.
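In practice, a finite size scaling exponent such as the one governing the avalanche size cutoff is commonly estimated from the slope of a log-log plot of the cutoff against the system size. The sketch below shows only that fitting step, on synthetic cutoffs generated from an exact power law. The system sizes, the amplitude and the exponent value are arbitrary assumptions and are not measurements from any model or data set.

```python
import math
from typing import Sequence


def loglog_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx


# Synthetic illustration only: cutoffs built from s_c = amplitude * L**D.
system_sizes = [16, 32, 64, 128, 256]
assumed_exponent = 2.25   # arbitrary illustrative value
assumed_amplitude = 0.5   # arbitrary illustrative value
cutoffs = [assumed_amplitude * size ** assumed_exponent for size in system_sizes]

fitted_exponent = loglog_slope(system_sizes, cutoffs)
print(fitted_exponent)
```

With real simulation or observational data the cutoffs carry statistical errors and corrections to scaling, so a fit of this kind is a starting point rather than a demonstration of scaling.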
While exponents are normally not independent, because they are related by scaling relations, other quantities are, yet the power of universality ties them up in a universality class. Only a few universality classes are expected to exist, so determining a very small number of universal quantities determines the whole class.
That is what BTW were envisaging for SOC; universality justifies the study of toy models:

"We choose the simplest possible models rather than wholly realistic and therefore complex models of actual physical systems. Besides our expectation that the overall qualitative features are captured in this way, it is certainly possible that quantitative properties (such as scaling exponents) may apply to more realistic situations, since the system operates at a critical point where universality may apply. The philosophy is analogous to that of equilibrium statistical physics where results are based on Ising models (and Heisenberg models, etc.) which have only the symmetry in common with real systems. Our "Ising models" are discrete cellular automata, which are much simpler to study than continuous partial differential equations." (Bak et al., 1988a, p. 365)

The early literature responded to that call for universality by offering models in the "BTW universality class" (Zhang, 1989; Manna, 1991). The focus quickly shifted to introducing new universality classes, in particular by breaking symmetries which were thought to be crucial (e.g. Drossel and Schwabl, 1992) and later provided an ordering principle (Milshtein et al., 1998; Biham et al., 2001; Hughes and Paczuski, 2002; Karmakar et al., 2005). One reason why the subject of SOC remains contentious is the richness of the results found. They often do not fall clearly in one universality class; in fact, even on the basis of extensive computer simulations it is often not even possible to determine whether scaling takes place at all, let alone the exponents (e.g. Dorn et al., 2001). The two computer models which so far display the clearest evidence for SOC, the Manna (Manna, 1991) and the Oslo Model, are in the same universality class (Nakanishi and Sneppen, 1997), clearly not for trivial reasons.