Introduction: Why PCA in Astrobiology?

The early works of Stanislaw Ulam and John von Neumann in the 1940s, the book entitled “Calculating Space” by Zuse (1969), John Conway’s popular Game of Life (described in Gardner 1970), and the first systematic analyses of Stephen Wolfram in 1983–84 (Wolfram 1983, 1984), together with the growth of computational power over the past decades, established a new approach—or philosophy—to making scientific models of various phenomena. Cellular automata (CA) modeling techniques are increasingly gaining momentum in studies of complex systems and their unpredictable behaviour. A CA operates on a lattice of cells in discrete time steps. Each cell is characterized by a state which evolves in time according to transition rules. Transition rules define the state of a cell in the next time step in relation to the present state of the cell itself and the states of the cells in its surroundings (neighbourhood). Even simple transition rules can result in substantial complexity of emergent behaviour (for details on CA theory see Ilachinski 2001). Although deterministic CA can create some random-like patterns, probabilistic cellular automata (PCA) are a more convenient tool for discrete modeling of intrinsically stochastic phenomena.
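To make the PCA concept concrete, the following minimal Python sketch implements a one-dimensional, two-state probabilistic automaton in which a cell’s fate depends stochastically on its own state and on its neighbours; all probabilities and names here are illustrative only and are unrelated to the Galactic model developed below.

```python
import numpy as np

rng = np.random.default_rng(42)

def pca_step(cells, p_spont=0.01, p_per_neighbour=0.3, p_decay=0.05):
    """One synchronous update of a 1-D, two-state probabilistic CA.
    A dead cell (0) becomes alive (1) with probability
    p_spont + p_per_neighbour * (number of live neighbours);
    a live cell dies with probability p_decay.  Values are illustrative."""
    neighbours = np.roll(cells, 1) + np.roll(cells, -1)      # periodic boundary
    p_birth = np.clip(p_spont + p_per_neighbour * neighbours, 0.0, 1.0)
    r = rng.random(cells.shape)
    born = (cells == 0) & (r < p_birth)
    died = (cells == 1) & (r < p_decay)
    return np.where(born, 1, np.where(died, 0, cells))

cells = np.zeros(100, dtype=int)
cells[50] = 1                        # trivial initial state: a single live cell
for _ in range(200):
    cells = pca_step(cells)
print(cells.sum(), "live cells after 200 steps")
```

Even this toy rule set already exhibits the interplay of intrinsic stochasticity and neighbour forcing exploited by the Galactic model described below.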

Probabilistic cellular automata have been studied extensively (Bennett and Grinstein 1985; Grinstein et al. 1985) and have shown good results in practice as a lucrative modeling tool in many fields of science and technology (e.g., modeling of forest fires, pandemics, immune response, urban traffic, etc. (Batty et al. 1997; Hoya White et al. 2006; Soares-Filho et al. 2002; Torrens 2000)). In particular, the application to biological sciences gives us a better explanation of the microscopic mechanisms that lead to the macroscopic behavior of the relevant systems (Borkowski 2009; de Oliveira 2002; Wood et al. 2006). These models are simple and yet exhibit very intricate behavior—not yet well understood—partly as a consequence of taking into account the fluctuations that play an important role in determining the critical behavior of the system considered. One important feature of almost all these models is the presence of phase transitions, which have potential to explain a wide variety of phenomenological features of biological and ecological systems (Bak and Boettcher 1997; Bak and Paczuski 1997; Langton 1990; Wood et al. 2006). This is crucial for our attempt to extend the domain of numerical simulations to astrobiology.

Astrobiology is a nascent multidisciplinary field which deals with the three canonical questions: How does life begin and develop in its widest cosmic context? Does life exist elsewhere in the universe? What is the future of life on Earth and in space? (Des Marais and Walter 1999; Grinspoon 2003; Chyba and Hand 2005) A host of important discoveries has been made during the last decade or so, the most important certainly being the discovery of a large number of extrasolar planets; the existence of many extremophile organisms, some of which possibly comprise the “deep hot biosphere” of Thomas Gold, while others live at altitudes of up to 41 km in the stratosphere; the discovery of subsurface water on Mars and of the huge ocean on Europa, and possibly also Ganymede and Callisto; the unequivocal discovery of many amino acids and other complex organic compounds in meteorites; the modeling of organic chemistry in Titan’s atmosphere; the quantitative treatment of the Galactic Habitable Zone (GHZ); the development of a new generation of panspermia theories, spurred by experimental verification that even terrestrial microorganisms easily survive the conditions of an asteroidal or cometary impact; progress in the methodology of the Search for ExtraTerrestrial Intelligence (SETI) studies, etc. In spite of all this lively research activity, there have so far been surprisingly few attempts at building detailed numerical models and quantitative theoretical frameworks which would permit an understanding of the accumulating empirical data or add rigor to the many hand-waving hypotheses which are thrown around. Some of the excellent exceptions are the studies of Lineweaver and collaborators (Lineweaver 2001; Lineweaver and Davis 2002; Lineweaver et al. 2004) on the age distribution of Earth-like planets and the structure of the Galactic Habitable Zone (Gonzalez et al. 2001). It is sometimes stated that we still understand the underlying “astrobiological dynamics” too poorly for such modeling; while that is undoubtedly true, there have been well-documented cases in the history of physical science (including the paradigmatic case of neutron diffusion through a metallic shield investigated by Ulam and von Neumann) in which various possible local dynamical behaviours converged toward similar globally interesting physical outcomes. It is exactly this motivation which prompts us to suggest the usage of PCA in studying astrobiological complexity and to show that this can offer us interesting, though unavoidably very preliminary, insights.

(Some quantitative models have been developed in order to justify or criticize particular SETI approaches. In general, they follow one of the two schools of thought from the early 1980s, being either (1) based on some extension of biogeography equations, starting with Newman and Sagan (1981) and recently used by Bjørk (2007); or (2) making use of discrete modeling, starting with the work of Jones (1981) and developed in rather limited form by Landis, Kinouchi, and others (Landis 1998; Kinouchi 2001; Cotta and Morales 2009; Bezsudnov and Snarskii 2010). The latter studies used particular aspects of the discrete approach to astrobiology, but have not provided a comprehensive grounding for using such models rather than other, often more developed, numerical tools. In addition, they have of necessity been limited by either arbitrary or vague boundary conditions and by the lack of specific astrophysical input dealing with the distribution of matter in the Milky Way and possible risk factors. While we use the results of the latter, “discrete” school of thought as a benchmark (in particular those of Cotta and Morales 2009), we attempt to show how they could be generalized to a wider scheme, encompassing not only SETI but much more general issues of astrobiological complexity.)

The plan of the paper is as follows. In the remainder of the Introduction we review some of the motivations for a digital perspective on astrobiology in general, and the usage of PCA in particular. In section “Probabilistic Model of the GHZ” our probabilistic model of astrobiological complexity of the Galaxy is introduced and its main results reviewed. In section “Fermi’s Paradox as a Boundary Condition” we discuss the key issue of boundary conditions, especially in their relationship to Fermi’s paradox and biological contingency argument. Finally, in the concluding section, we summarize our main results and indicate directions for future improvement.

Discrete Nature of the Distribution of Matter

While the present approach uses global symmetries of the Galactic system (planarity, thin disk, thick disk, spiral arms, etc.), one should keep in mind that the realistic distribution of baryonic matter is discrete. In particular, stars possessing habitable planets are hypothesized to form a well-defined structure, the GHZ: an annular ring-shaped subset of the thin disk. Since the lifeforms we are searching for depend on the existence and properties of habitable Earth-like planets, the separation of the order of ∼1 pc between neighboring planetary systems (characterizing the GHZ) ensures that, even if some exchange of biologically relevant matter between planetary systems occurs before the possible emergence of technological, star-faring species (as in classical panspermia theories), it remains a very small effect, so the assumption of a discrete distribution holds. Even if an advanced technological civilization eventually arises and engages in interstellar travel, or decides to live in habitats independent of Earth-like planets, it is to be expected that its distribution will stay discrete for quite a long time, since the resources necessary for interstellar travel (likely to be expensive at all epochs) will remain distributed around Main Sequence stars.

Contingency in Biological Sciences

The issues of determinism vs. indeterminism and contingency vs. convergence in the biological sciences have been very hotly debated ever since Darwin and Wallace published their theories of evolution through natural selection in 1859. One of the main opposing views has in recent years been put forward by proponents of contingent macroevolution, such as Gould (1989, 1996) or McShea (1998). According to this view, the contingent nature of biological evolution guarantees that the outcome is essentially random and unrepeatable. When this essential randomness is coupled with the stochastic nature of external physical changes, especially dramatic episodes of mass extinctions, we end up with a picture in which the relative frequency of any given biological trait (including intelligence, tool-making and other pre-requisites for an advanced technological civilization) is, in a sufficiently large ensemble, proportional only to the relative size of the relevant region of morphological space. While proponents of this view do not explicitly mention astrobiology, it is clear that the required ensemble can be provided only in the astrobiological context (Fry 2000), notably by GHZ. (Of course, the definition of morphological space hinges on the common biochemical basis of life, although it does not seem impossible to envision a generalization.)

On the diametrically opposite end of the spectrum, Conway Morris (1998, 2003) argues that convergent processes led to the current general landscape of the terrestrial biosphere, including the emergence of intelligence in primates. Dawkins (1989) and Dennett (1995) are certainly closer to this position, though they are somewhat reserved with respect to Conway Morris’ unabashed anthropocentrism (see also Sterelny 2005). For some of the other discussions in the voluminous literature on the subject see Simpson (1949), Raup (1991), Adami et al. (2000), and Radick (2000).

For the present purpose, we need to emphasize that, while the issue of fundamental determinism or indeterminism is a metaphysical one, in practice even perfectly deterministic processes (like asteroidal motions or Buffon’s matchsticks) are often successfully modeled by stochastic methods. Even fervent supporters of convergence admit that there is much variation between the actual realizations of the firmly fixed large-scale trends, allowing a lot of margin for stochastic models. In the context of researching SETI targets, for instance, a relative difference in timescales of \(10^6\)–\(10^7\) yrs makes for quite a different account, although it could be argued that this is just a small-scale perturbation or straying from the broadly set evolutionary pathway. Recent studies, such as that of Borkowski (2009), show that macroevolutionary trends on Earth can be successfully described exactly within the framework of cellular automata models.

Stepwise Change in Evolution

Carter (1983, 2008), Hanson (1998), Knoll and Bambach (2000) and other authors emphasize a number of crucial steps necessary for noogenesis. Some of the examples include the appearance of the “Last Common Ancestor”, prokaryote diversification, and multicellularity, up to and including noogenesis. These crucial steps (“megatrajectories” in terms of Knoll and Bambach 2000) might not be intrinsically stochastic, but our present understanding of the conditions and physico-chemical processes leading to their completion is so poor that we might wish to start the large-scale modeling with only broadly constrained Monte Carlo simulations. Subsequent improvements in our knowledge will be easily accommodated in such a framework (see also section “Framework Adaptable to Future Observations and Results” below). This applies to any list of such steps (the problem, as Carter emphasized, is that the number of really critical steps is quite controversial). The work of Pérez-Mercader (2002) shows how scaling laws can be applied to the problem of the emergence of complexity in astrobiology; in a sense, the present study is a continuation and extension of that work.

It is important to understand the two different senses in which we encounter stepwise changes in the astrobiological domain. In one sense, we encounter models of punctuated equilibrium attempting to explain discrete changes in evolution, including possibly catastrophic mass extinctions (e.g., Bak and Boettcher 1997). On the other hand, megatrajectories can be generalized to cases in which we are dealing with intentional actions, such as those which are interesting from the point of view of SETI studies. Both of these highlight the advantage of discrete models like PCA over some of the numerical work published in the literature, usually in the context of SETI studies. For instance, Bjørk (2007) calculates the rate of colonization of planetary systems in the Galaxy under relatively restricted conditions. This approach, pioneered by Newman and Sagan (1981), uses just a small part of the possible space of states regarding the capacities of advanced technological evolution. If, as the great historian of science Steven J. Dick has warned, postbiological evolution is the dominant general mode of evolution in the last megatrajectory (Dick 2003, 2008), many of the concerns of SETI models based on continuous approximations become obsolete (see the criticism in Ćirković and Bradbury 2006). On the other hand, stepwise changes and phase transitions are generic features of a large class of PCA (Petersen and Alstrom 1997).

Important Global Tendencies and Redundant Local Information

This methodological proviso—historically the all-important motivation behind von Neumann’s introduction of stochastic models in physics—provides a rationale for similar simulations in other fields of life sciences, in particular ecology (Soares-Filho et al. 2002) or epidemiology (Hoya White et al. 2006), or even urban traffic (Batty et al. 1997). Even more to the point of the specifics of astrobiology, PCA have recently been successfully used in the “Daisyworld” models (Wood et al. 2006), which share much of the complexity of the models of the GHZ described below. We do not need to know specific details of biogenesis, noogenesis and other processes on a particular planet in GHZ in order to get a global picture of the GHZ evolution and argue for or against particular research programs, for instance, for or against a specific SETI targeting project.

This applies to temporal as well as spatial scales. Research on both the past and the future of the universe (classical cosmology and physical eschatology) reveals clearly defined timescales, which can be treated as discrete units. Even lacking detailed information on the GHZ census at any particular epoch, we might still wish to be able to say something about the overall tendencies up to the present day and into the foreseeable future. This is analogous to cases in which global tendencies of complex systems are sought with evolutionary computation algorithms (de Oliveira 2002).

All this should be considered in light of the breakdown of the long-held “closed-box view” of the evolution of local biospheres of habitable planets. In both astrobiology and the Earth sciences, such a paradigm shift toward an interconnected, complex view of our planet has already been present for quite some time in both empirical and theoretical work. In particular, possible influences of Earth’s cosmic environment on climate (Carslaw et al. 2002; Pavlov et al. 2005), impact catastrophes (Clube and Napier 1990; Clube 1992; Asher et al. 1994; Matese and Whitmire 1996; Matese et al. 1998), biogenesis (Cockell 2000; Cockell et al. 2003), or even biotic transfer (Napier 2004, 2007; Wallis and Wickramasinghe 2004; Wallis et al. 2008) have become legitimate and very active subjects of astrobiological research. Thus, it is desirable to be able to consider them within an integrative view, assigning them at least nominal quantitative values, to be substituted by better-supported data in the future.

Framework Adaptable to Future Observations and Results

PCA models in general rely on an input matrix of probabilities (of transitions between internal states). This makes such models a very flexible tool, since the matrix can be fitted to any number of future observations, as well as to conceptual innovations and theoretical elaborations. In particular, it is to be expected that on-going or near-future space-based missions, like DARWIN (Cockell et al. 2009) or GAIA (Perryman et al. 2001), will provide additional constraints on the input matrix of probabilities. The same applies to future theoretical breakthroughs, for instance the detailed modeling of the ecological impact of intermittent bursts of high-energy cosmic rays or hard electromagnetic radiation. This will be accompanied by a “fine-graining” of the automaton states and of the network of transition probabilities.

Historically Used Probabilistic Arguments in SETI Debates

Many arguments used in SETI debates have been based on probabilistic reasoning, the most important being the “anthropic” argument of Carter (1983). For elaborations on the same topic see Lineweaver and Davis (2002), Davies (2003), Ćirković et al. (2009), etc. Remaining in this same context offers clear advantages: we can account for various phenomena suggested as dominant and gain a historical perspective on this extremely rich discussion. While this is a general argument for applying the entire class of Monte Carlo simulations to astrobiological problems, something can also be said for particular PCA implementations of numerical models. The wealth of existing knowledge on different PCA applications is immensely useful when approaching a manifestly complex and multidisciplinary field such as astrobiology and SETI studies.

Actually, it might be interesting (and historically sobering) to notice that one of the fathers of evolutionary theory, Alfred Russel Wallace, was a forerunner of astrobiology. He actually argued for the uniqueness of the Earth and humankind on the basis of a cosmological model in which the Sun was located near the center of the Milky Way, similar to the long-defunct Kapteyn universe (Wallace 1903). This remarkable, although incorrect, argument demonstrates how important astrophysical understanding has been since the very beginning of scientific debates on life and intelligence elsewhere.

The SETI debate has, in the course of the last four decades, been dominated by analysis of the Drake equation (Drake 1965), which is in itself the simplest general probabilistic framework for analyzing the worthiness (or otherwise) of SETI projects. Many criticisms of the Drake equation have been raised (e.g., Walters et al. 1980; Wallenhorst 1981; Ćirković 2004; Burchell 2006), accompanied by suggestions for modification, but the key problem remains: overall, the level of astrobiological numerical modeling has remained very low or non-existent, so there has been no viable alternative to the crudeness of the Drake equation. With the vastly widened spectrum of astrobiological research in the last decade and a half, it seems appropriate to overcome this deficiency and offer a new probabilistic framework in this area. For a recent attempt along these lines see Maccone (2010, 2012). The PCA formalism we develop here has the same essential form as the Drake equation: it uses a list of input probabilities in order to generate a global conclusion about the number and density of plausible observational SETI targets. However, it adds much more information and can incorporate many additional phenomena, like biotic feedbacks, interstellar panspermia, etc. One possible application of the PCA formalism is in the numerical modeling of Fermi’s Paradox (Fig. 1), to which we shall return later.

Fig. 1

A scheme for modeling Galactic astrobiological evolution, with the particular goal of resolving Fermi’s Paradox in a specific quantitative manner. The astrophysical model of the Milky Way, as well as a particular choice of transition probabilities, serve as input data for the PCA kernel generating random possible astrobiological histories, which can subsequently be tested against any chosen boundary conditions, including those following from Fermi’s Paradox. Other specific versions of the same philosophical approach are possible

Practicality in Parallelization

Astrophysical numerical models are usually quite expensive in terms of CPU time. Efficient parallelization is, therefore, not a luxury but a necessity. CA models are particularly suitable for this, since the non-parallelizable fraction (instructions dealing with the state change of a single cell and its immediate environment) is a very small part of the whole, and hence, according to Amdahl’s law and its modern multicore versions (e.g., Hill and Marty 2008), the net speed gain is large.
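A back-of-the-envelope illustration of this point uses the classic form of Amdahl’s law; the serial fraction and core counts in the Python sketch below are illustrative, not measured values.

```python
def amdahl_speedup(serial_fraction, n_cores):
    """Classic Amdahl's law: maximum speedup on n_cores when a fraction
    `serial_fraction` of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

# If the per-cell update logic confines the serial part to ~1% of the run
# time, 64 cores already give a ~39x gain (illustrative numbers only).
for cores in (8, 64, 512):
    print(cores, round(amdahl_speedup(0.01, cores), 1))
```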

Probabilistic Model of the GHZ

The basic quantity associated with the cell state in our PCA models is astrobiological complexity. It describes not only the complexity of life itself but also the amount of life-friendly departure from the simple high-temperature/high-entropy mixture of chemical elements commonly found in stars. Stars can be considered as objects with zero astrobiological complexity, while, e.g., molecular clouds (hosts of complex organic molecules) can be assigned an astrobiological complexity slightly higher than zero. Further up this astrobiological “entropy” scale, when it comes to planets, it becomes increasingly important to consider the environment in which the planet resides, and not just its intrinsic chemical composition. The sites with the highest astrobiological complexity are life-bearing and are further quantified by the complexity of the life they host (Fig. 2).

Fig. 2

Steps in our PCA model reflecting the major astrobiological stages of evolution of each cell. In this scheme we have neglected the possibility of interstellar panspermia, while the possibility of panspermia within the same planetary system is reflected in the increased weight of each particular cell

We consider a probabilistic cellular automaton where cells of four types occupy the sites of a regular square lattice of dimensionality D = 2. Representing the Galaxy, especially the astrobiologically interesting thin-disk component, by a planar system is physically justified (see e.g., Binney and Tremaine 1987), and offers significant computational advantages over realistic D = 3 equivalent models; we shall return to the relaxation of this assumption in the final section. We model GHZ as the annular ring between \(R_\mathrm{inn} = 6\) kpc and \(R_\mathrm{out} = 10\) kpc. We use absorbing boundary conditions at the edges of this annulus, which is rather obvious for modeling sites with simple lifeforms and remains a good starting approximation for other cases. For the purpose of simplification in our model development, we have adopted a discrete four-state scale (for the cell at position i, j):

$$ \sigma (i,j) = \left\{ \begin{array} {r@{\quad:\quad}l} 0 & \textrm{no life}\\ 1 & \textrm{simple life} \\ 2 & \textrm{complex life} \\ 3 & \textrm{technological civilization (TC)} \\ \end{array} \right. . $$
(1)

The states and their considered transitions are shown schematically in Fig. 2. Of course, this is a very coarse representation of astrobiological complexity. The major reason we believe it to be a good start for model-building is that these scale points characterize the only observed evolution of life, deduced from the terrestrial fossil record. The state σ = 1 is, for instance, exemplified by terrestrial prokaryotes and archaea, while σ = 2 is exemplified by complex metazoans. The major model unknowns, i.e., the transition rules that model the emergence and evolution of cells on the CA lattice, are implemented via a probability matrix (\(\hat{P}\)). The relevant biological timescales are directly modeled as the elements of \(\hat{P}\). With this approach, various evolutionary scenarios can easily be implemented by just changing the probability matrix, with no changes to the model mechanics. We set the time step to be equivalent to \(10^6\) yrs (1 Myr).
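A minimal sketch of the corresponding lattice setup is given below in Python, using the four-state scale of Eq. (1), the GHZ annulus quoted above and a 1 Myr time step; the resolution and the variable names are illustrative choices, not the exact implementation used for the results reported in this paper.

```python
import numpy as np

NO_LIFE, SIMPLE, COMPLEX, TC = 0, 1, 2, 3      # the four sigma states of Eq. (1)
R_INN, R_OUT = 6.0, 10.0                        # kpc, GHZ annulus
RES = 10                                        # cells per kpc (illustrative; the runs below use 100)
DT_MYR = 1.0                                    # one time step = 1 Myr

n = int(2 * R_OUT * RES)                        # lattice spanning the full disk diameter
y, x = np.indices((n, n))
r_kpc = np.hypot(x - n / 2 + 0.5, y - n / 2 + 0.5) / RES
ghz_mask = (r_kpc >= R_INN) & (r_kpc <= R_OUT)  # cells belonging to the GHZ annulus

sigma = np.zeros((n, n), dtype=np.int8)         # trivial initial state: no life anywhere
print(ghz_mask.sum(), "GHZ cells out of", n * n)
```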

Elements of:

$$ \hat{P}^\mathrm{t}=\left( \begin{array}{cccc} P_{0,0}^\mathrm{t} & P_{0,1}^\mathrm{t} & \cdots & P_{0,m-1}^\mathrm{t}\\ P_{1,0}^\mathrm{t} & P_{1,1}^\mathrm{t} & \cdots & P_{1,m-1}^\mathrm{t}\\ \vdots & \vdots & \ddots & \vdots\\ P_{m-1,0}^\mathrm{t} & P_{m-1,1}^\mathrm{t} & \cdots & P_{m-1,m-1}^\mathrm{t}\\ \end{array} \right) $$
(2)

are indicative of possible cell transitions for the m-state (states range from 0 to m − 1) PCA. In general, the state of a cell in the next time step will result from the temporal evolution of the cell itself (intrinsic evolution), the influence of the surrounding cells (local forced evolution) and the Galactic environment (externally forced evolution). For the above scaling, the full implementation of the transition probabilities can be achieved with a 4×4×6 matrix (\(\hat{P}_\mathrm{ijk}\), where each k-part of \(\hat{P}_\mathrm{ijk}\) is represented by one matrix of \(\hat{P}^\mathrm{t}\) type). The k = 0 part of \(\hat{P}\) contains the i→j transition probabilities for intrinsic evolution, while the k = 5 part contains the externally forced transition probabilities. The rest of the matrix \(\hat{P}\) describes the forced evolutionary influence of a surrounding cell in state k − 1 on the i→j transition of the cell in question. Once developed, the PCA kernel is thus a highly adaptable platform for modeling different hypotheses by simply changing the elements of \(\hat{P}_\mathrm{ijk}\) as input parameters. There are 84 transition probabilities in total, but most of them are likely to be of no practical importance (e.g., colonization of a planet by complex life—state 2) or technically redundant (e.g., all probabilities with i = j).
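The following Python sketch shows one way a single-cell update could combine the three channels just described (intrinsic, locally forced, externally forced). The summing-and-clipping rule used here is only one simple possibility, adopted for illustration rather than as the exact rule used in the production runs; the function and variable names are likewise illustrative.

```python
import numpy as np

M_STATES, K_PARTS = 4, 6                        # P[i, j, k]: i -> j transition, k-th forcing channel
P = np.zeros((M_STATES, M_STATES, K_PARTS))     # to be populated from Tables 1 and 2

rng = np.random.default_rng(0)

def update_cell(sigma, i, j, P, external_hit=False):
    """Sketch of one cell update: intrinsic (k=0), locally forced
    (k=1..4, neighbour in state k-1) and externally forced (k=5) channels.
    Summing and clipping the probabilities is an illustrative assumption."""
    s = sigma[i, j]
    p = P[s, :, 0].copy()                              # intrinsic evolution
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # von Neumann neighbourhood
        ni, nj = i + di, j + dj
        if 0 <= ni < sigma.shape[0] and 0 <= nj < sigma.shape[1]:
            k = sigma[ni, nj] + 1                      # neighbour in state k-1 forces channel k
            p += P[s, :, k]
    if external_hit:                                   # e.g. a supernova / GRB "reset"
        p += P[s, :, 5]
    p = np.clip(p, 0.0, 1.0)
    p[s] = max(0.0, 1.0 - (p.sum() - p[s]))            # stay put with the remaining probability
    return rng.choice(M_STATES, p=p / p.sum())
```

In an actual run the matrix P would of course be filled with the fiducial values discussed below rather than left at zero.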

With the previously considered argument for a discrete matter distribution, it is likely that the majority of forced probabilities can be disregarded. If we consider a cell on the PCA lattice as a planetary system, the truly significant effects are likely to come from the \(P_{034}\), \(P_{134}\) and \(P_{234}\) elements, which indicate a TC colonizing adjacent sites. Also, the panspermia probabilities (\(P_{012}\), \(P_{013}\) and \(P_{014}\)) could be of some importance in denser parts of the Galaxy, where the ratio of the average distance between adjacent planetary systems to the average planetary system size is reduced. However, this is highly questionable, since the more populated parts of space experience greater dynamical instability, which could seriously affect the habitability of the comprised systems. The externally forced probabilities \(P_{105}\), \(P_{205}\), \(P_{305}\), \(P_{215}\), \(P_{315}\) and \(P_{325}\) are likely to be of greater importance. They are indicative of the global Galactic regulation mechanisms (gamma-ray bursts, supernovae, collisions, etc.) that depend on global Galactic parameters (mainly the star formation rate and matter distribution) and can significantly alter the evolution of life. However, these probabilities are still strongly debated and their influence on potential biospheres is somewhat controversial. While the aforementioned probabilities are of possible significance, the intrinsic probabilities are likely the most important, since they reflect the internal conditions in planetary systems and on the planets themselves. Table 1 lists the probabilities of possible importance with a short description.

Table 1 A list of significant probability matrix elements with a short description

For some of the probabilities in the model, a simple generalization of the known terrestrial conditions is possible. In particular, this is the case with the parameters \(P_{010}\), \(P_{120}\), and \(P_{230}\). Studies of the Earth’s fossil record suggest timescales of 1 Gyr, 3 Gyr and 600 Myr, respectively (these prototype values are actually somewhat more conservative than those taken directly from the terrestrial record, since both simple life and observers appeared more quickly on Earth). Despite numerous past debates, there is still no consensus about the influence of extinction events on the overall evolution of the terrestrial biosphere (Gould’s “third tier of evolution”, Gould (1985)). Even the biggest known extinctions during the Phanerozoic eon did not degrade the astrobiological complexity of the terrestrial biosphere to the stage preceding the Cambrian explosion. In fact, these events might even have acted as an “evolutionary pump”, because they opened new ecological niches for certain species (Ward and Brownlee 2000); mammals experienced rapid advancement on account of the extinction of the dinosaurs after the K-T event. At present, only vague estimates of the relevant probabilities/timescales can be used (Table 2).

Table 2 Fiducial values of transition timescales \(\tau_{ijk}\) (centroid values of Gaussian distributions from which the values in actual simulations are taken) corresponding to input transition probabilities \(P_{ijk}\)
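A sketch of how such fiducial timescales might be turned into per-step probabilities is given below; the Gaussian draw follows the description of Table 2, while the conversion \(p = 1 - \exp(-\Delta t/\tau)\) and the scatter values are assumptions made here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

def transition_prob(tau_myr, sigma_myr, dt_myr=1.0):
    """Draw a timescale from a Gaussian centred on the fiducial value
    (Table 2) and convert it to a per-step probability.  The conversion
    p = 1 - exp(-dt/tau) is an assumption; for tau >> dt it reduces to
    the intuitive p ~ dt/tau."""
    tau = max(rng.normal(tau_myr, sigma_myr), dt_myr)   # keep the draw physical
    return 1.0 - np.exp(-dt_myr / tau)

# Illustrative use of the terrestrial-record timescales quoted in the text
# (1 Gyr for 0->1, 3 Gyr for 1->2, 0.6 Gyr for 2->3); the scatter values
# are placeholders, not values taken from Table 2.
for label, tau in (("P_010", 1000.0), ("P_120", 3000.0), ("P_230", 600.0)):
    print(label, transition_prob(tau, 0.1 * tau))
```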

The other group of input probabilities (comprising the remaining \(\hat{P}\) elements in Table 1) is not known empirically, even for the terrestrial case (one is tempted to say: fortunately enough). In particular, we do not know the probability of complex life on Earth going extinct in the next Myr—although we are justifiably curious to get at least a vague estimate of that particular parameter. Variation of this input parameter makes for one of the most interesting future applications of the presented model; for now, we have used fiducial values inferred from the analyses of Rees (2003) and Bostrom and Ćirković (2008).

The input distribution of the Earth-like planet formation rate is taken from the seminal paper of Lineweaver (2001). We use the model of the star-formation history of our Galaxy published by Rocha-Pinto et al. (2000a, b). This is more complex than the usually assumed quasi-exponential decay form of the star-formation density, but fits the observational data on the ages of populations, chemical evolution, etc. much better. We employ this form of the star-formation history as the driver of Type II supernovae and gamma-ray bursts (astrobiological “reset” events; the choice of resets is described in detail in Vukotić and Ćirković 2008; Vukotić 2010).

After running N = 10 Monte Carlo simulations with synchronized update at a spatial resolution of R = 100 cells kpc\(^{-1}\), we analyze the ensemble-averaged results. To get an overall picture of the evolution of the system, we calculate the evolution of the masses in each state of our PCA:

$$ M_{\sigma} (t) = \langle \sum\limits_{i,j} \delta_{c(i,j) \sigma} (t)\rangle, $$
(3)

where

$$ \delta_{c(i,j) \sigma} \equiv \left\{ \begin{array} {r@{\quad:\quad}l} 0 & c(i,j) \neq \sigma \\ 1 & c(i,j) = \sigma \\ \end{array} \right. , $$
(4)

is the Kronecker delta. This mass value counts the number of cells in each σ state at each individual step t, and the average is taken over the number of simulation runs. The results, averaged over N = 10 runs, are shown in Fig. 3, plotted against the age of the Galactic thin disk. We notice strongly nonlinear evolution, as well as the numerical predominance of σ = 1 cells at late times—which can be construed as support for the “rare Earth” hypothesis of Ward and Brownlee (2000) (see also Forgan and Rice 2010). At t ∼ 7000 (corresponding to ∼ 3 Gyr before the present) the first TCs appear in significant numbers, and the number of such σ = 3 sites increases subsequently (non-monotonically, though, and not conforming to the simple scaling relationships occasionally suggested in the literature, e.g., in Fogg 1987; Bezsudnov and Snarskii 2010). An example of the distribution of σ = 3 sites is shown in Fig. 4. (Much further work needs to be done in order to establish the sensitivity of these results to individual input probabilities. A numerical error at an early stage of the PCA kernel testing, which overestimated the probability \(P_{230}\) by about half an order of magnitude in comparison to the terrestrial value used here, accidentally enabled us to test the sensitivity of the clustering analysis and of \(V/V_0\), with encouraging results.)
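For reference, the ensemble-averaged state masses of Eqs. (3)–(4) can be computed along the following lines (a minimal sketch; the function and variable names are illustrative):

```python
import numpy as np

def state_masses(sigma, m_states=4):
    """M_sigma for one lattice snapshot: the number of cells in each state
    (the Kronecker-delta count of Eqs. 3-4)."""
    return np.bincount(sigma.ravel(), minlength=m_states)

def ensemble_masses(snapshots_per_run):
    """Average the per-state counts over simulation runs, as in Eq. (3).
    `snapshots_per_run` is a list (one entry per run) of arrays with
    shape (n_steps, n, n)."""
    per_run = np.array([[state_masses(s) for s in run] for run in snapshots_per_run])
    return per_run.mean(axis=0)          # shape: (n_steps, m_states)
```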

Fig. 3

Evolution of populations of sites in various states i = 0, 1, 2, 3 (color-coded) in the PCA model of GHZ, averaged over N = 10 simulation runs. Timescale represents the age of the thin disk of the Milky Way, corrected for the first 3.8 Gyr lacking sufficient metallicity (Lineweaver 2001)

Fig. 4

An example of clusters formed in the coarse-grained PCA model of the Galactic Habitable Zone; scales are in kpc, and the snapshot corresponds to a “late” epoch

In order to proceed with the analysis of the clustering of such sites, which is of obvious interest for practical SETI considerations, we develop a polygonal representation of clusters, shown through an example in Fig. 5. Obviously (as in all forms of the percolation problem), clusters are porous structures, which may contain many areas of persistence, suggested by Kinouchi (2001) as a resolution of Fermi’s Paradox. We test this by investigating, at the fiducial “late” epoch of t = 9,500 (by which time we have at least one example of a spanning cluster in each simulation run), the filling factor of the polygonal representation: the number of cells occupied by the cluster measured against the total number of cells within the polygon. (We use “spanning cluster” in the specific restricted sense of a cluster which spans the entire GHZ, i.e., has a radial size of at least \(R_\mathrm{out} - R_\mathrm{inn}\), which seems appropriate for this particular form of the percolation problem.) The results are shown in Fig. 6 and are consistent with the distributions usually obtained in clustering analyses of percolation in other contexts. This serves as an auxiliary way of testing the proposed algorithm.
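A simplified version of this clustering analysis can be sketched as follows, using connected-component labelling and a convex hull as a stand-in for the polygonal representation of Fig. 5 (the hull, the cut-offs and the names are assumptions made here for illustration, not the algorithm actually used):

```python
import numpy as np
from scipy import ndimage
from scipy.spatial import ConvexHull

def cluster_stats(sigma, ghz_r_kpc, r_inn=6.0, r_out=10.0):
    """Label connected sigma=3 clusters and compute, for each, a crude
    radial span and filling factor.  `ghz_r_kpc` holds the galactocentric
    radius of every cell in kpc."""
    labels, n_clusters = ndimage.label(sigma == 3)
    stats = []
    for c in range(1, n_clusters + 1):
        ii, jj = np.nonzero(labels == c)
        r = ghz_r_kpc[ii, jj]
        span_kpc = r.max() - r.min()
        spanning = span_kpc >= (r_out - r_inn)          # "spans the entire GHZ"
        points = np.column_stack([ii, jj])
        try:
            area_cells = ConvexHull(points).volume       # 2-D hull "volume" is an area
            fill = len(points) / max(area_cells, 1.0)
        except Exception:                                # degenerate (tiny or collinear) clusters
            fill = 1.0
        stats.append({"size": len(points), "span_kpc": span_kpc,
                      "spanning": spanning, "fill": fill})
    return stats
```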

Fig. 5

An example of the polygonal algorithm used for measuring the span of clusters in the simulation

Fig. 6

The distribution of mass filling factors of clusters of state σ = 3 (“technological civilizations”) at epoch t = 9,500, measured by the polygonal algorithm illustrated in Fig. 5. Large clusters will tend to have filling factors of ≃ 50%, leaving many sites for continuation of astrobiological evolution within their spans

As shown in Fig. 7, the set of clusters at the same fiducial late epoch roughly obeys the scaling relation

$$ N(>S) \propto S^{-\alpha}, $$
(5)

where N( > S) is the number of clusters with more than S cluster cells (the “mass” of the cluster). The best-fit mass index is α = 1.72 ± 0.01. Such behavior is remarkable in view of the highly non-uniform underlying distribution of the ages of sites, represented by the planetary formation rate data and the star-formation rate data influencing the distribution of reset events. The temporal dependence of this mass index in the course of Galactic history is shown in Fig. 10, with the weak increase in the last Gyr probably reflecting a sort of “natural selection” favoring large clusters. While this might be an important piece of information in debates surrounding, for example, Kardashev’s famous classification of hypothetical Galactic civilizations (Kardashev 1964), much further work is required in order to better understand this behavior.
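For concreteness, a mass index of this kind can be estimated from a list of cluster sizes with a simple log-log fit of the complementary cumulative counts, as in the sketch below; this is a minimal illustration, not necessarily the exact fitting procedure behind the quoted value.

```python
import numpy as np

def mass_index(sizes):
    """Least-squares estimate of alpha in N(>S) ~ S^-alpha from a list of
    cluster sizes (numbers of cells): a log-log fit of the complementary
    cumulative counts."""
    s = np.sort(np.asarray(sizes, dtype=float))
    n_greater = np.arange(len(s), 0, -1)          # N(>=S_i) for the sorted sizes
    slope, _ = np.polyfit(np.log10(s), np.log10(n_greater), 1)
    return -slope

print(mass_index([3, 5, 5, 8, 12, 20, 33, 60, 150]))   # illustrative sizes only
```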

Fig. 7

Mass index α = 1.72 ±0.01 of the same set of clusters as in Fig. 6

Finally, we need to consider the distribution of sizes of σ = 3 clusters, shown in Fig. 8. Our results strongly confirm the intuitive view, on which most construals of Fermi’s Paradox are based, that this distribution is strongly time-dependent. For the sample of chosen results—clusters at t = 9,500—we notice that the highest concentration of clusters is at ≃ 0.1 kpc, corresponding to small-to-medium sized interstellar civilizations (for our, rather conservative, choice of the colonization probabilities in the input probability matrix), while the number of truly large clusters (equal to or larger than the size of the GHZ itself) is marginal (Figs. 9 and 10).

Fig. 8

Lengthscale distribution of σ = 3 clusters as estimated by the polygonal method (see Fig. 5). The sparsely populated upper-right part of the diagram represents what can be called the “percolation” solution to Fermi’s Paradox, as suggested by Landis and Kinouchi (within the framework of our neocatastrophic model, see text). The vertical line denotes the radial size of GHZ, i.e., the quantity \(R_\mathrm{out} - R_\mathrm{inn}\)

Fig. 9

The evolution of the critical exponent describing the set of σ = 3 clusters with time in the “late” epochs of the history of astrobiological complexity of the Milky Way

Fig. 10

Behaviour of the mass exponent α for clusters of state 3 (“advanced civilizations”) shown during the last Gyr at epochs separated by 100 Myr. Although the rising trend lies within the (probably underestimated) uncertainties, it is explicable: those civilizations which survive tend to expand and add to the high-mass end of the distribution

In Fig. 11 we present the dependence of the relative occupied volume in GHZ on the time elapsed since the formation of the Milky Way thin disk, averaged over 10 simulation runs for various (color-coded) values of the characteristic timescales. This quantity, conventionally labeled \(V/V_0\) (where V is the occupied volume, interpreted as the volume in which the presence of a technological civilization is easily detectable), has occasionally been used in SETI studies as a measure of the ascent of a technological civilization on Kardashev’s ladder. Here we have used \(V_0\) as the volume of GHZ in our D = 2 model, suggesting that we are in fact overestimating \(V/V_0\), since it is reasonable to assume that the expansion of technological civilizations is not constrained in any significant manner by the boundaries of GHZ. The important conclusion here is that, although we have started with a “Copernican” input matrix of probabilities, we still obtain \(V/V_0 \ll 1\), which is in accordance with the conventional interpretation of our observations related to Fermi’s Paradox. Thus, our PCA model seems to support the idea that we can explain Fermi’s Paradox in the framework of such a neocatastrophic discrete model. Although we have not run our simulation far enough into the future (as defined by our chosen timescale) to reach \(V/V_0 \simeq 1\), it seems that the increase is much shallower than in the models of Bezsudnov and Snarskii (2010).
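Operationally, \(V/V_0\) can be estimated from a lattice snapshot as sketched below; whether detectability extends beyond the occupied cells themselves is left open here, so the optional dilation radius is an illustrative assumption rather than part of the model specification.

```python
import numpy as np
from scipy import ndimage

def v_over_v0(sigma, ghz_mask, detect_radius_cells=0):
    """Fraction of the GHZ 'volume' in which a technological civilization
    would be easily detectable.  With detect_radius_cells=0 this is simply
    the fraction of GHZ cells in state 3; a positive radius (assumption)
    lets detectability extend beyond the occupied cells."""
    occupied = (sigma == 3) & ghz_mask
    if detect_radius_cells > 0:
        occupied = ndimage.binary_dilation(occupied, iterations=detect_radius_cells)
        occupied &= ghz_mask
    return occupied.sum() / ghz_mask.sum()
```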

Fig. 11

The average value of \(V/V_0\) in the Milky Way during the last 2 Gyr in our PCA model of GHZ for \(t_{134} = t_{234} = 110\) Myr. It is likely that we have overestimated \(V/V_0\) here and that this plot represents only a lower limit, since the expanding wavefront of σ = 3 sites is likely to encompass sites outside of GHZ

Fermi’s Paradox as a Boundary Condition

The question of the astrobiological “landscape” of Galactic evolution can be regarded as a particular instance of a (not necessarily well-posed) boundary value problem. While we do not understand the laws of local “astrobiological dynamics”, we can use boundary conditions, together with the assumption that the local terrestrial example is randomly chosen from the (unknown) distribution, to constrain the space of possible landscapes. Some of the boundary conditions are those we have used in building our PCA model: the age of the Galactic thin disk, the boundaries of GHZ, the statistical distribution of reset events. However, the most controversial one comes from Fermi’s Paradox (Brin 1983; Webb 2002; Ćirković 2009).

In other words, can the famous lunchtime question of Enrico Fermi, “Where are they?”, be helpful when it comes to answering the question, “Where are we?”—in what kind of neighborhood do we exist? Depending on the aspirations of our possible “fellow Galactizens” and the chances for their existence, there are two broad scenarios. For the purpose of this paper, we will simply call them soft and hard. The hard scenario puts more weight on Fermi’s paradox as a boundary condition, since it is assumed that we have not yet observed an alien civilization simply because there are no such civilizations capable (or willing, see Ćirković 2009) of interstellar travel and communication. On the soft side, we can think of ourselves as having been missed, because we reside in a passive pocket of the Galaxy that is not near any of the “highways” used by other civilizations (Kinouchi 2001), or because we are deprived of contact of any kind (the “Zoo hypothesis” of Ball 1973). There are numerous assumptions that can be made about the nature of Fermi’s paradox—Smith (2009) concludes that even after five decades there is still no way to find the “right” values of the variables in the Drake equation, though the equation is controversial for other reasons as well (see also Ćirković 2004).

The hard version of the paradox will constrain the probability matrix phase space to regions indicative of sparse contact chances throughout Galactic history, while the soft version allows the phase space to be somewhat larger, meaning that there were contacts between civilizations in the Galaxy but we simply have not experienced them, for various possible reasons. Obviously, the soft version of the paradox acts as a looser boundary condition than the hard version. The porosity of large σ = 3 clusters in our simulations (Fig. 6), coupled with the low \(\langle V/V_0 \rangle\) (Fig. 11), demonstrates how this still seems acceptable within the “Copernican” framework, thus essentially confirming the conclusions of Landis (1998) and Kinouchi (2001), but with the addition of catastrophic reset events. The downside is that this does not take into account the fact that at least some of the manifestations of advanced technological civilizations would be observable over large interstellar distances (e.g., Freitas 1985; Ćirković and Bradbury 2006). More research will be necessary in order to quantify the conditions for such a “Dysonian” approach to SETI (Dyson 1960; Sagan and Walker 1966; Carrigan 2009).

Clearly, the issue will need to be settled by constructing a whole series of models along the lines of our simple PCA, probing large volumes of the input probability matrix space (as shown schematically in Fig. 1). Such a computationally more challenging programme will enable the precise determination of those chunks of parameter space consistent with a particular chosen form of Fermi’s Paradox (for example, the statement that there are no technological civilizations 1 Myr or more older than us within a sphere of 100 pc radius around the Sun, or a similar statement). This approach may be used as complementary to attempts to build a sounder theoretical basis for SETI studies (Maccone 2010).
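Schematically, such a programme amounts to scanning the space of probability matrices and retaining only those whose simulated histories satisfy the chosen formulation of the paradox, as in the following sketch; the acceptance criterion, the prior on \(\hat{P}\) and the `run_simulation` interface are all placeholders, not parts of the model defined above.

```python
import numpy as np

rng = np.random.default_rng(7)

def satisfies_fermi(history, v_ratio_now, hard=True):
    """Placeholder boundary condition: e.g. 'no sigma=3 cluster older than
    1 Myr within 100 pc of the Sun' for the hard version, or merely a low
    present-day V/V_0 for the soft one.  Both thresholds are illustrative."""
    return v_ratio_now < (1e-3 if hard else 1e-1)

def scan_probability_space(run_simulation, n_trials=100):
    """Keep only those randomly drawn probability matrices whose simulated
    astrobiological histories are consistent with the chosen form of
    Fermi's Paradox.  `run_simulation` is assumed to map a 4x4x6 matrix to
    (history, present-day V/V_0)."""
    accepted = []
    for _ in range(n_trials):
        P = rng.random((4, 4, 6)) * 1e-3          # illustrative prior on the matrix
        history, v_now = run_simulation(P)
        if satisfies_fermi(history, v_now):
            accepted.append(P)
    return accepted
```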

Discussion and Future Plans

We have analyzed a prototype 4-state astrobiological PCA whose boundary conditions are derived from our understanding of the astrophysics and astrochemistry of the Milky Way, and whose dynamical rules are inferred from our understanding of terrestrial biological evolution. It clearly belongs to Wolfram’s third class of cellular automata (Wolfram 1983), being able to generate arbitrarily complex aperiodic states from a simple (in our case even trivial) initial state. It is capable of generating many possible astrobiological histories of our Galaxy, probing in this way the huge parameter space involved. The main advantage of the present approach is that the question “How many probable solutions are there?” becomes for the first time numerically tractable. By simply changing the values of the \(\hat{P}\) elements over the part of the phase space of interest, we can model the resulting astrobiological histories that have led to the present state. These parts of the phase space can be further interpreted and connected with astrobiologically relevant processes and events.

Even with the more restrictive hard version of Fermi’s Paradox there is still a great deal of \(\hat{P}\) phase space to be speculated about and included in the models. It would probably be best to start with the smallest possible number of parameters. The rest of the \(\hat{P}\) elements can be considered in subsequent phases of the iterative process in accordance with the results of preceding simulations and their analysis. Instead of implementing all elements listed in Table 1, we can restrict ourselves to implementing some of them or to subsume a group of parameters into a single parameter.

The task of investigating the sensitivity to input parameters remains; we are most comfortable including the elements related to timescales known to at least an order of magnitude (\(P_{010}\), \(P_{120}\), \(P_{230}\)). Given the arguments for a discrete matter distribution, we can implement forced evolution by allowing only colonization by a TC in a neighboring cell, such that \(P_{034} = P_{134} = P_{234}\). Using the fact that, once developed, complex life on Earth did not perish despite a few major extinction events, it is probably justifiable to approximate \(P_{105} = P_{205} = P_{305}\) and \(P_{100} = P_{200} = P_{300}\) (or perhaps to separate \(P_{300}\) to account for possible TC-induced risks, see Bostrom and Ćirković 2008). Namely, once life has reached the stage of advanced civilization, it is reasonable to assume that it cannot be easily degraded—such a degradation could possibly be achieved only by a sterilizing disaster that completely deprives the planet of living organisms. Apart from \(P_{010}\), \(P_{120}\) and \(P_{230}\), there are then three or four (with a separate \(P_{300}\)) adjustable parameters. By varying these parameters in our future simulations, we can hopefully restrict their values to a narrower range using Fermi’s paradox as a boundary condition (cf. Duric and Field 2003). The model could then be further refined by relaxing the equalities mentioned above.

Considering the vast uncertainties that characterize research of this kind (some of them probably stemming from the implicit Copernican assumptions), and despite the advantages of the approach presented in this paper, we think that major improvements can be made with incoming new data from future multidisciplinary studies and space missions. In future work we plan to present and analyze the results of a similar PCA model with a more detailed probability matrix, as well as higher spatial resolution using massive parallel computing. Besides these, there are several phenomenological improvements which seem to hold some promise for future work and deserve to be mentioned here.

Further improvement of the boundary conditions can be implemented by including colonization by TCs of sites beyond the boundaries of GHZ (in particular, the large volume beyond \(R_\mathrm{out}\) can be interesting for those advanced TCs motivated primarily by optimization criteria, Ćirković and Bradbury 2006). An important extension of the present model would be the incorporation of interstellar panspermia: the possibility of transfer of simple lifeforms (commonly envisaged in the form of extremophiles of the Bacteria or Archaea domains of life) from one planetary system to another. Several viable theories have been proposed recently (Napier 2004, 2007; Wallis and Wickramasinghe 2004), whose common property is that interstellar panspermia is a very slow process. Thus, we have not used it in obtaining the results presented here, but the generalization is rather straightforward: characteristic timescales are ∼ \(10^9\) yrs for transfer between neighbouring planetary systems at the Solar galactocentric distance and correspondingly larger for more distant systems, following roughly the random-walk arrival times. This translates, in an ideal PCA with 1 pc-sized cells, into interaction between neighbours, weighted by the mean stellar density in our Milky Way model in the same way as the density of planetary systems (as well as the density of supernovae/gamma-ray bursts) is weighted. In more coarse-grained simulations, panspermia would increase the biogenesis potential of a single cell and possibly act to reduce the timescale for the transition to complex life in it, or to increase the sterilization timescale (making simple life more persistent in a single cell of the automaton). While it is hard to gauge the overall impact of panspermia on the Milky Way astrobiological landscape, it is conceivable that locally—including the neighbourhood of the Solar System—and in the long run it may make a difference; only detailed future work can resolve the issue. Besides increasing the spatial resolution, one may strive to increase the temporal resolution as well, in particular when it comes to the modeling of TC clustering.

An additional refinement left for future models is taking into account the differential rotation of the Galaxy. Such rotation will cause continuous deformation of clusters on timescales of ∼ \(10^8\) years and longer. Since the kinematics of the Milky Way is rather well understood, it is conceptually straightforward to apply this to our GHZ model, although the computational implementation is, according to preliminary considerations, rather expensive. It seems that this effect might be of some importance in the late epochs for TC clusters of large span. It has been intuitively suggested as a difficulty for those answers to Fermi’s paradox, like Fogg’s “Interdict Hypothesis”, which rely on large-scale uniformity of TC behavior (Fogg 1987); further development of quantitative GHZ models will present an opportunity to check this intuition numerically.

Finally, an obvious further step is building D = 3 PCA models, reflecting the vertical stratification of Galactic matter, as well as some possible additional effects on local biospheres, e.g., crossings of the Galactic plane and spiral arms and their ecological consequences (Leitch and Vasisht 1998; Gies and Helsel 2005). This will add a new layer of complexity and re-emphasize the degree to which local biological conditions are embedded in wider and richer astrophysical surroundings.

In conclusion, PCA seem to be a fruitful tool for the quantitative approach to astrobiology in the Milky Way context. Although still plagued by many uncertainties, quantitative astrobiology has wide perspectives, utilizing the best of today’s computational physics together with continuously updated observational data from the new generation of astronomical instruments. In particular, it carries the prospect of at least better framing—if not answering—perhaps the most intriguing question in all of science, “Are we alone?”