The dynamics of power laws: Fitness and aging in preferential attachment trees

Continuous-time branching processes describe the evolution of a population whose individuals generate a random number of children according to a birth process. Such branching processes can be used to understand preferential attachment models in which the birth rates are linear functions. We are motivated by citation networks, where power-law citation counts are observed as well as aging in the citation patterns. To model this, we introduce fitness and age-dependence in these birth processes. The multiplicative fitness moderates the rate at which children are born, while the aging is integrable, so that individuals receive a finite number of children in their lifetime. We show the existence of a limiting degree distribution for such processes. In the preferential attachment case, where fitness and aging are absent, this limiting degree distribution is known to have power-law tails. We show that the limiting degree distribution has exponential tails for bounded fitnesses in the presence of integrable aging, while the power-law tail is restored when integrable aging is combined with fitness with unbounded support and at most exponential tails. In the absence of integrable aging, such processes are explosive.


Introduction
Preferential attachment models (PAMs) aim to describe dynamical networks. As in many real-world networks, PAMs exhibit power-law degree distributions that arise directly from the dynamics, and are not artificially imposed as, for instance, in configuration models or inhomogeneous random graphs.
PAMs were first proposed by Albert and Barabási [1], who defined a random graph model where, at every discrete time step, a new vertex is added with one or more edges that are attached to existing vertices with probability proportional to their degrees, i.e., $\mathbb{P}(\text{vertex } (n+1) \text{ is attached to vertex } i \mid \text{graph at time } n) \propto D_i(n)$, where $D_i(n)$ denotes the degree of a vertex $i \in \{1, \dots, n\} = [n]$ at time $n$. In general, the dependence of the attachment probabilities on the degree can be through a preferential attachment function of the degree, also called preferential attachment weights. Such models are called PAMs with general weight function. Depending on the asymptotics of the weight function $w(\cdot)$, the limiting degree distribution of the graph can behave rather differently. There is an enormous body of literature showing that PAMs present power-law decay in the limiting degree distribution precisely when the weight function is affine, i.e., it is a constant plus a linear function. See, e.g., [14, Chapter 8] and the references therein. In addition, these models show the so-called old-get-richer effect, meaning that the vertices of highest degree are the vertices present early in the network formation. An extension of this model is the preferential attachment model with a random number of edges [8], where new vertices are added to the graph with a different number of edges according to a fixed distribution, and again power-law degree sequences arise. A generalization that also gives younger vertices the chance to have high degrees is given by PAMs with fitness, as studied in [10], [9]. Borgs et al. [6] present a complete description of the limiting degree distribution of such models, with different regimes according to the distribution of the fitness, using generalized Pólya urns.
arXiv:1703.05943v2 [math.PR] 9 Nov 2017
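The attachment rule above can be illustrated with a minimal simulation sketch of the tree case (one edge per new vertex). The function name, the initial graph (a single edge) and the optional shift `delta`, which turns the proportional rule into an affine one, are assumptions made for this illustration and are not taken from the cited works.

```python
import random

def preferential_attachment_tree(n, delta=0.0, seed=0):
    """Grow a tree on n >= 2 vertices.

    Each new vertex attaches to an existing vertex i with probability
    proportional to D_i + delta (delta = 0 gives the purely proportional rule).
    Assumes delta > -1 so that all weights stay positive.
    """
    rng = random.Random(seed)
    degrees = [1, 1]        # start from a single edge between vertices 0 and 1
    parents = [None, 0]     # parents[v] is the vertex that v attached to
    for new in range(2, n):
        weights = [d + delta for d in degrees]
        target = rng.choices(range(new), weights=weights, k=1)[0]
        degrees[target] += 1
        degrees.append(1)   # the new vertex arrives with one edge
        parents.append(target)
    return degrees, parents

degrees, parents = preferential_attachment_tree(2000, seed=1)
print(max(degrees), sum(degrees))
```

The quadratic running time is acceptable for a few thousand vertices; the sketch favours readability over efficiency.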
An interesting variant of a multi-type PAM is investigated in [21], where the author considers PAMs in which fitnesses are not i.i.d. across the vertices, but are sampled according to distributions depending on the fitnesses of the ancestors.
This work is motivated by citation networks, where vertices denote papers and the directed edges correspond to citations. For such networks, other models using preferential attachment schemes and adaptations of them have been proposed, mainly in the physics literature. Aging effects, i.e., taking the age of a vertex into account in its likelihood to obtain children, have been extensively considered as the starting point to investigate their dynamics [25], [26], [11], [12], [7]. Here the idea is that old papers are less likely to be cited than new papers. Such aging has been observed in many citation network datasets and makes PAMs with weight functions depending only on the degree ill-suited for them. As mentioned above, such models could more aptly be called old-get-richer models, i.e., in general old vertices have the highest degrees. In citation networks, instead, papers with many citations appear all the time. Barabási, Wang and Song [24] investigate a model that incorporates these effects. On the basis of empirical data, they suggest a model where the aging function follows a lognormal distribution with paper-dependent parameters, and the preferential attachment function is the identity. In [24], the fitness function is estimated, in contrast to the more classical approach where it is taken to be i.i.d. Hazoglou, Kulkarni, Skiena and Dill [13] propose similar dynamics for citation evolution, but only consider the presence of aging and cumulative advantage, without fitness.
Tree models, arising when new vertices are added with only one edge, have been analyzed in [2], [3], [23], [22] and lead to continuous-time branching processes (CTBPs). The degree distributions in tree models show the same qualitative behavior as in the non-tree setting, while their analysis is much simpler. Motivated by this and by the wish to understand the qualitative behavior of PAMs with general aging and fitness, we take the CTBP or tree setting as the starting point of our model. Such processes have been intensively studied, due to their applications in other fields, such as biology. Detailed and rigorous analyses of CTBPs can be found in [4], [15], [18], [23], [2], [3], [5]. A CTBP consists of individuals whose children are born according to certain birth processes, these processes being i.i.d. across the individuals in the population. The birth processes $(V_t)_{t\ge 0}$ are defined in terms of point or jump processes on $\mathbb{N}$ [15], [18], where the birth times of children are the jump times of the process, and the number of children of an individual at time $t \in \mathbb{R}^+$ is given by $V_t$.
In the literature, CTBPs are used as a technical tool to study PAMs [3], [23], [21]. Indeed, the CTBP at the $n$th birth time follows the same law as the PAM consisting of $n$ vertices. In [3], [23], the authors prove an embedding theorem between branching processes and preferential attachment trees, and give a description of the degree distribution in terms of the asymptotic behavior of the weight function $w(\cdot)$. In particular, a power-law degree distribution is present in the case of (asymptotically) linear weight functions [22]. In the sub-linear case, instead, the degree distribution is stretched-exponential, while in the super-linear case it collapses, in the sense that one of the first vertices will receive all the incoming new edges after a certain step [19]. Due to the apparent exponential growth of the number of nodes in citation networks, we view the continuous-time process as the real network, which deviates from the usual perspective. Because of its motivating role in this paper, let us now discuss the empirical properties of citation networks in detail.

1.1 Citation networks data.
We analyze the Web of Science database, focusing on three different fields of science: Probability and Statistics (PS), Electrical Engineering (EE) and Biotechnology and Applied Microbiology (BT). We first point out some characteristics of citation networks that we wish to replicate in our models. Real-world citation networks possess five main characteristics:
(1) In Figure 1, we see that the number of scientific publications grows exponentially in time. While this is quite prominent in the data, it is unclear how this exponential growth arises. It could be due either to the fact that the number of journals listed in Web of Science grows over time, or to the fact that journals contain more and more papers.
(2) In Figure 2, we notice that these datasets have empirical power-law citation distributions. Thus, most papers attract few citations, but the variability in the number of citations is rather substantial. We are also interested in the dynamics of the citation distribution of the papers published in a given year, as time proceeds. This can be observed in Figure 3. We see a dynamical power law, meaning that at any time the degree distribution is close to a power law, but the exponent changes over time (and in fact decreases, which corresponds to heavier tails). When time grows quite large, the power-law exponent approaches a fixed value.
(3) In Figure 4, we see that the majority of papers stop receiving citations after some time, while a few others keep being cited for longer times. This inhomogeneity in the evolution of node degrees is not present in classical PAMs, where the degree of every fixed vertex grows as a positive power of the graph size. Figure 4 also shows that the numbers of citations of papers published in the same year can be rather different. In particular, after a first increase, the average increment of citations decreases over time (see Figure 5). We observe a difference in this aging effect between the PS dataset and the other two datasets, due to the fact that in PS, scientists tend to cite older papers than in EE or BT. Nevertheless, the average increment of citations received by papers in different years tends to decrease over time for all three datasets.
(4) Figure 6 shows the linear dependence between the past number of citations of a paper and the future ones. Each plot represents the average number of citations received by papers published in 1984 in the years 1993, 2006 and 2013 according to the initial number of citations in the same year. At least for low values of the starting number of citations, we see that the average number of citations received during a year grows linearly. This suggests that the attractiveness of a paper depends on the past number of citations through an affine function.
(5) A last characteristic that we observe is the lognormal distribution of the age of cited papers. In Figure 7, we plot the distribution of the age of cited papers, looking at references made by papers in different years. We have used a 20-year time window in order to compare different citing years. Notice that this lognormal distribution seems to be very similar across different years, and the shape is similar across different fields.
Let us now explain how we translate the above empirical characteristics into our model. First, CTBPs grow exponentially over time, as observed in citation networks. Secondly, the aging present in citation networks, as seen in both Figures 4 and 5, suggests that citation rates become smaller for large times, in such a way that typical papers stop receiving citations at some (random) point in time. The hardest characteristic to explain is the power-law degree sequence. For this, we note that citations of papers are influenced by many external factors that affect the attractiveness of papers (the journal, the authors, the topic, ...). Since this cannot be quantified explicitly, we introduce another source of randomness in our birth processes that we call fitness. This appears in the form of multiplicative factors of the attractiveness of a paper, and for lack of better knowledge, we take these factors to be i.i.d. across papers, as often assumed in the literature. These assumptions are similar in spirit to the ones by Barabási et al. [24], which were also motivated by citation data, and we formalize and extend their results considerably. In particular, we give the precise conditions under which power-law citation counts are observed in this model.
Our main goal is to define CTBPs with both aging as well as random fitness that keep having a power-law decay in the in-degree distribution. Before discussing our model in detail in Section 2, we present the heuristic ideas behind it as well as the main results of this paper.
1.2 Our main contribution. The crucial point of this work is to show that it is possible to obtain power-law degree distributions in preferential attachment trees where the birth process does not depend only on an asymptotically linear weight sequence, but also on integrable aging and fitness. Let us now briefly explain how these two effects change the behavior of the degree distribution.
Integrable aging and affine preferential attachment without fitness. In the presence of aging but without fitness, we show that the aging effect substantially slows down the birth process. In the case of affine weights, aging destroys the power-law of the stationary regime, generating a limiting distribution that consists of a power law with exponential truncation. We prove this under reasonable conditions on the underlying aging function (see Lemma 5.1).
Integrable aging and super-linear preferential attachment without fitness. Since the aging destroys the power law of the affine PA case, it is natural to ask whether the combination of integrable aging and super-linear weights restores the power-law limiting degree distribution. Theorem 2.3 states that this is not the case, as super-linear weights imply explosiveness of the branching process, which is clearly unrealistic in the setting of citation networks (here, we call a weight sequence $k \mapsto f_k$ super-linear when $\sum_{k\ge 1} 1/f_k < \infty$). This result is quite general, because it holds for any integrable aging function. Due to this, it is impossible to obtain power laws from super-linear preferential attachment weights. This suggests that (apart from slowly-varying functions) affine preferential attachment weights have the strongest possible growth, while maintaining exponential (and thus, in particular, non-explosive) growth.
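The summability criterion $\sum_{k\ge 1} 1/f_k < \infty$ can be made concrete with a small numerical sketch. The two weight sequences below (affine with $a = 2$, $b = 1$, and quadratic) are hypothetical choices made only for illustration; the partial sums of the affine sequence keep growing logarithmically, while those of the quadratic sequence converge (here to $\pi^2/6$).

```python
import math

def inverse_weight_sum(f, n_terms):
    """Partial sum of 1/f(k) for k = 1, ..., n_terms."""
    return sum(1.0 / f(k) for k in range(1, n_terms + 1))

def affine(k):
    # Hypothetical affine weights f_k = 2k + 1 (not super-linear).
    return 2.0 * k + 1.0

def quadratic(k):
    # Hypothetical super-linear weights f_k = k^2.
    return float(k) ** 2

for n in (10**2, 10**3, 10**4):
    print(n, inverse_weight_sum(affine, n), inverse_weight_sum(quadratic, n))
print("limit of the quadratic case:", math.pi ** 2 / 6)
```

Since the waiting time at level $k$ of a birth process with rates $f_k$ has mean $1/f_k$, a finite sum means that infinitely many jumps fit into a finite expected time, which is the mechanism behind explosion.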
Integrable aging and affine preferential attachment with unbounded fitness. In the case of aging and fitness, the asymptotic behavior of the limiting degree distribution is rather involved. We estimate the asymptotic decay of the limiting degree distribution with affine weights in Proposition 5.5. With the example fitness classes analyzed in Section 5.3, we prove that power-law tails are possible in the setting of aging and fitness, at least when the fitness has a roughly exponential tail. So far, PAMs with fitness required the support of the fitness distribution to be bounded. The addition of aging allows the support of the fitness distribution to be unbounded, a feature that seems reasonable to us in the context of citation networks. Indeed, the relative attractiveness of one paper compared to another one can be enormous, which is inconsistent with a bounded fitness distribution. While we do not know precisely what the necessary and sufficient conditions on the aging and the fitness distribution are to ensure a power-law degree distribution, our results suggest that affine PA weights with integrable aging and fitnesses with at most an exponential tail in general do so, a feature that was not observed before.
Dynamical power laws. In the case of fitness with exponential tails, we further observe that the number of citations of a paper of age t has a power-law distribution with an exponent that depends on t. We call this a dynamical power law, and it is a possible explanation of the dynamical power laws observed in citation data (see Figure 3).
Universality. An interesting and highly relevant observation in this paper is that the limiting degree distribution of preferential attachment trees with aging and fitness shows a high amount of universality. Indeed, for integrable aging functions, the dependence on the precise choice of the aging function seems to be minor, except for the total integral of the aging function. Further, the dependence on fitness is quite robust as well.

Our model and main results
In this paper we introduce the effect of aging and fitness in CTBP populations, giving rise to directed trees. Our model is motivated by the study of citation networks, which can be seen as directed graphs. Trees are the simplest case in which we can see the effects of aging and fitness. Previous work has shown that PAMs can be obtained from PA trees by collapsing, and their general degree structure can be quite well understood from that of trees. For example, PAMs with fixed out-degree $m \ge 2$ can be defined through a collapsing procedure, where a vertex in the multigraph is formed by $m \in \mathbb{N}$ vertices in the tree (see [14, Section 8.2]). In this case, the limiting degree distribution of the PAM preserves the structure of the tree case ([14, Section 8.4], [5, Section 5.7]). This explains the relevance of the tree case results for the study of the effect of aging and fitness in PAMs. It would be highly interesting to prove this rigorously.
2.1 Our CTBP model. CTBPs represent a population made of individuals producing children independently from each other, according to i.i.d. copies of a birth process on $\mathbb{N}$. We present the general theory of CTBPs in Section 3, where we define such processes in detail and we refer to general results that are used throughout the paper. In general, considering a birth process $(V_t)_{t\ge 0}$ on $\mathbb{N}$, every individual in the population has an i.i.d. copy of the process $(V_t)_{t\ge 0}$, and the number of children of individual $x$ at time $t$ is given by the value of the process $V^x_t$. We consider birth processes defined by a sequence of weights $(f_k)_{k\in\mathbb{N}}$ describing the birth rates. Here, the time between the $k$th and the $(k+1)$st jump is exponentially distributed with parameter $f_k$. The behavior of the whole population is determined by this sequence.
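As an illustration of a birth process driven by a weight sequence, the following sketch samples the jump times of a single individual by adding independent exponential waiting times. The function name and the affine weights with $a = b = 1$ are assumptions chosen for this example; for these weights the mean number of children at time $t$ is $e^{t} - 1$, which gives a simple sanity check.

```python
import math
import random

def birth_times(f, horizon, rng):
    """Return the jump times in [0, horizon] of a birth process with rates f(k)."""
    times, t, k = [], 0.0, 0
    while True:
        # With k children already born, the next waiting time is Exp(f(k)).
        t += rng.expovariate(f(k))
        if t > horizon:
            return times
        times.append(t)
        k += 1

# Hypothetical affine weights f_k = k + 1 (a = b = 1), observed up to time 1.
rng = random.Random(0)
samples = [len(birth_times(lambda k: k + 1.0, 1.0, rng)) for _ in range(20000)]
print(sum(samples) / len(samples), math.e - 1.0)
```

The two printed numbers should be close, since the empirical mean estimates $\mathbb{E}[V_1] = e - 1$ for this particular weight sequence.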
The fundamental theorem for the CTBPs that we study is Theorem 3.10, quoted in Section 3. It states that, under some hypotheses on the birth process $(V_t)_{t\ge 0}$, the population grows exponentially in time, which nicely fits the exponential growth of scientific publications as indicated in Figure 1. Further, using a so-called random vertex characteristic as introduced in [15], a complete class of properties of the population can be described, such as the fraction of individuals having $k$ children, as we investigate in this paper. The two main properties are stated in Definitions 3.8 and 3.9, and are called the supercritical and Malthusian properties. These properties require that there exists a positive value $\alpha^*$ such that
\[ \mathbb{E}\big[V_{T_{\alpha^*}}\big] = 1 \qquad \text{and} \qquad -\frac{d}{d\alpha}\, \mathbb{E}\big[V_{T_\alpha}\big]\Big|_{\alpha = \alpha^*} < \infty, \]
where $T_\alpha$ denotes an exponentially distributed random variable with rate $\alpha$, independent of the process $(V_t)_{t\ge 0}$. The unique value $\alpha^*$ that satisfies both conditions is called the Malthusian parameter, and it describes the exponential growth rate of the population size. The aim is to investigate the ratio
\[ \frac{\text{number of individuals with } k \text{ children at time } t}{\text{size of the total population at time } t}. \]
According to Theorem 3.10, this ratio converges almost surely to a deterministic limiting value $p_k$. The sequence $(p_k)_{k\in\mathbb{N}}$, which we refer to as the limiting degree distribution of the CTBP (see Definition 3.12), is given by
\[ p_k = \mathbb{P}\big(V_{T_{\alpha^*}} = k\big). \]
The starting idea of our model of citation networks is that, given the history of the process up to time $t$, the rate at which an individual of age $t$ with $k$ children generates a new child is $Y f_k g(t)$, where $f_k$ is a non-decreasing PA function of the degree, $g$ is an integrable function of time, and $Y$ is a positive random variable called fitness. Therefore, the likelihood to generate children increases with the number of children and/or with the fitness, while it is reduced by age. Recalling Figure 6, we assume that the PA function $f$ is affine, so $f_k = ak + b$. In terms of a PA scheme, this implies
\[ \mathbb{P}(\text{a paper cites another with past } k \text{ citations} \mid \text{past}) \approx \frac{n(k)(ak + b)}{A}, \]
where $n(k)$ denotes the number of papers with $k$ past citations, and $A$ is the normalization factor. Such behavior has already been observed by Redner [20] and Barabási et al. [16]. We assume throughout the paper that the aging function $g$ is integrable. In fact, we start from the fact that the age of cited papers is lognormally distributed (recall Figure 7). By normalizing such a distribution by the average increment in the number of citations of papers in the selected time window, we identify a universal function $g(t)$. Such a function can be approximated by a lognormal shape with field-dependent parameters $c_1$, $c_2$ and $c_3$. In particular, from the procedure used to define $g(t)$, we observe that
\[ g(t) \approx \frac{\text{number of references to year } t}{\text{number of papers of age } t} \cdot \frac{\text{total number of papers considered}}{\text{total number of references considered}}, \]
which means in terms of PA mechanisms that
\[ \mathbb{P}(\text{a paper cites another of age } t \mid \text{past}) \approx \frac{n(t)\, g(t)}{B}, \]
where $B$ is the normalization factor, while this time $n(t)$ is the number of papers of age $t$.
This suggests that the citing probability depends on age through a lognormal aging function $g(t)$, which is integrable. This is one of the main assumptions in our model, as we discuss in Section 1.2. It is known from the literature ([22], [23], [2]) that CTBPs show power-law limiting degree distributions when the infinitesimal jump rates depend only on a sequence $(f_k)_{k\in\mathbb{N}}$ that is asymptotically linear. Our main aim is to investigate whether power laws can also arise in branching processes that include aging and fitness. The results are organized as follows. In Section 2.2, we discuss the results for CTBPs with aging in the absence of fitness. In Section 2.3, we present the results with aging and fitness. In Section 2.4, we specialize to fitness distributions with exponential tails, where we show that the limiting degree distribution is a power law with a dynamic power-law exponent.
2.2 Results with aging without fitness. In this section, we focus on aging in PA trees in the absence of fitness. The aging process can then be viewed as a time-changed stationary birth process (see Definition 3.13). A stationary birth process is a stochastic process $(V_t)_{t\ge 0}$ such that, for $h$ small enough,
\[ \mathbb{P}(V_{t+h} = k+1 \mid V_t = k) = f_k h + o(h). \]
In general, we assume that $k \mapsto f_k$ is increasing. The affine case arises when $f_k = ak + b$ with $a, b > 0$. By our observations in Figure 6, as well as related works ([20], [16]), the affine case is a reasonable approximation for the attachment rates in citation networks. For a stationary birth process $(V_t)_{t\ge 0}$, under the assumption that it is supercritical and Malthusian, the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$ of the corresponding branching process is given by
\[ p_k = \frac{\alpha^*}{\alpha^* + f_k} \prod_{i=0}^{k-1} \frac{f_i}{\alpha^* + f_i}. \tag{2.2} \]
For a more detailed description, we refer to Section 3.2. Branching processes defined by stationary processes (with no aging effect) have a so-called old-get-richer effect. As this is not what we observe in citation networks (recall Figure 4), we want to introduce aging in the reproduction process of individuals. The aging process arises by adding age-dependence in the infinitesimal transition probabilities:
Definition 2.1 (Aging birth processes). Consider a non-decreasing PA sequence $(f_k)_{k\in\mathbb{N}}$ of positive real numbers and an aging function $g : \mathbb{R}^+ \to \mathbb{R}^+$. We call a stochastic process $(N_t)_{t\ge 0}$ an aging birth process (without fitness) when
(1) $N_0 = 0$, and $N_t \in \mathbb{N}$ for all $t \in \mathbb{R}^+$;
(2) $N_t \le N_s$ for every $t \le s$;
(3) for fixed $k \in \mathbb{N}$ and $t \ge 0$, as $h \to 0$,
\[ \mathbb{P}(N_{t+h} = k+1 \mid N_t = k) = f_k\, g(t)\, h + o(h). \]
Aging processes are time-rescaled versions of the corresponding stationary process defined by the same sequence $(f_k)_{k\in\mathbb{N}}$. In particular, for any $t \ge 0$, $N_t$ has the same distribution as $V_{G(t)}$, where $G(t) = \int_0^t g(s)\,ds$. In general, we assume that the aging function is integrable, which means that $G(\infty) := \int_0^\infty g(s)\,ds < \infty$.
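The time-change description ($N_t$ distributed as $V_{G(t)}$) suggests a simple way to sample an aging birth process: generate the jump times of the stationary process and map them back through the inverse of $G$. The sketch below does this under the assumption $g(t) = e^{-\lambda t}$, for which $G(t) = (1 - e^{-\lambda t})/\lambda$ and $G(\infty) = 1/\lambda$; the function name and the parameter values are hypothetical.

```python
import math
import random

def aging_birth_times(f, lam, rng):
    """Birth times of N_t = V_{G(t)} for the aging function g(t) = exp(-lam * t)."""
    g_total = 1.0 / lam              # G(infinity) for this choice of g
    times, s, k = [], 0.0, 0
    while True:
        s += rng.expovariate(f(k))   # next jump time of the stationary process V
        if s >= g_total:
            # G(t) < G(infinity) for every finite t, so this jump never happens.
            return times
        # Invert G(t) = (1 - exp(-lam * t)) / lam to return to real time.
        times.append(-math.log(1.0 - lam * s) / lam)
        k += 1

# Hypothetical parameters: f_k = k + 1 and lam = 1, so G(infinity) = 1.
rng = random.Random(0)
lifetime = [len(aging_birth_times(lambda k: k + 1.0, 1.0, rng)) for _ in range(20000)]
print(sum(lifetime) / len(lifetime), math.e - 1.0)
```

Every individual receives finitely many children here, and for these parameters the empirical mean of the lifetime number of children should be close to $\mathbb{E}[V_{G(\infty)}] = e - 1$.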
This implies that the number of children of a single individual in its entire lifetime has distribution $V_{G(\infty)}$, which is finite in expectation. In terms of citation networks, this assumption is reasonable since we do not expect papers to receive an infinite number of citations (recall Figure 5). Instead, for the stationary process $(V_t)_{t\ge 0}$ in Definition 3.13, we have that $V_t \to \infty$ $\mathbb{P}$-a.s., so that the aging process also diverges $\mathbb{P}$-a.s. when $G(\infty) = \infty$. For aging processes, the main result is the following theorem, proven in Section 4. In its statement, we rely on the Laplace transform of a function. For a precise definition of this notion, we refer to Section 3.
Theorem 2.2 (Limiting distribution for aging branching processes). Consider an integrable aging function $g$ and a PA sequence $(f_k)_{k\in\mathbb{N}}$. Denote the corresponding aging birth process by $(N_t)_{t\ge 0}$. Then, assuming that $(N_t)_{t\ge 0}$ is supercritical and Malthusian, the limiting degree distribution of the branching process $N$ defined by the birth process $(N_t)_{t\ge 0}$ is given by
\[ p_k = \frac{\alpha^*}{\alpha^* + f_k \hat{\mathcal{L}}^g(k, \alpha^*)} \prod_{i=0}^{k-1} \frac{f_i \hat{\mathcal{L}}^g(i, \alpha^*)}{\alpha^* + f_i \hat{\mathcal{L}}^g(i, \alpha^*)}, \tag{2.3} \]
where $\alpha^*$ is the Malthusian parameter of $N$. Here, the sequence of coefficients $(\hat{\mathcal{L}}^g(k, \alpha^*))_{k\in\mathbb{N}}$ appearing in (2.3) is given by
\[ \hat{\mathcal{L}}^g(k, \alpha^*) = \frac{\mathcal{L}\big(\mathbb{P}(V_{G(\cdot)} = k)\, g(\cdot)\big)(\alpha^*)}{\mathcal{L}\big(\mathbb{P}(V_{G(\cdot)} = k)\big)(\alpha^*)}, \tag{2.4} \]
where, for $h : \mathbb{R}^+ \to \mathbb{R}$, $\mathcal{L}(h(\cdot))(\alpha)$ denotes the Laplace transform of $h$. Further, considering a fixed individual in the branching population, the total number of children in its entire lifetime is distributed as $V_{G(\infty)}$.
The limiting degree distribution maintains a product structure as in the stationary case (see (2.2) for comparison). Unfortunately, the analytic expression for the probability distribution $(p_k)_{k\in\mathbb{N}}$ in (2.3) given by the previous theorem is not explicit. In the stationary case, the form reduces to the simple expression in (2.2).
In general, the asymptotics of the coefficients $(\hat{\mathcal{L}}^g(k, \alpha^*))_{k\in\mathbb{N}}$ is unclear, since it depends both on the aging function $g$ and on the PA weight sequence $(f_k)_{k\in\mathbb{N}}$ itself in an intricate way. In particular, we have no explicit expression for the ratio in (2.4), except in special cases. In this type of birth process, the cumulative advantage given by $(f_k)_{k\in\mathbb{N}}$ and the aging effect given by $g$ cannot be separated from each other.
Numerical examples in Figure 8 show how aging destroys the power-law degree distribution. In each of the two plots, the limiting degree distribution of a stationary process with affine PA weights gives a power-law degree distribution, while the process with two different integrable aging functions does not. In the examples we have used $g(t) = e^{-\lambda t}$ and $g(t) = (1 + t)^{-\lambda}$ for some $\lambda > 1$, and we observe the insensitivity of the limiting degree distribution with respect to $g$. The distribution given by (2.3) can be seen as the limiting degree distribution of a CTBP defined by preferential attachment weights $(f_k \hat{\mathcal{L}}^g(k, \alpha^*))_{k\in\mathbb{N}}$. This suggests that $f_k \hat{\mathcal{L}}^g(k, \alpha^*)$ is not asymptotically linear in $k$.
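To make the power-law behavior of the stationary affine case tangible, the sketch below evaluates the standard product formula $p_k = \frac{\alpha}{\alpha + f_k} \prod_{i<k} \frac{f_i}{\alpha + f_i}$ for a stationary birth process with rates $f_k$ observed at an independent exponential time of rate $\alpha$. The choice $a = b = 1$ is hypothetical; for affine weights without aging, solving $\mathbb{E}[V_{T_\alpha}] = b/(\alpha - a) = 1$ gives $\alpha^* = a + b$, and in this special case the product collapses to $p_k = 4/((k+1)(k+2)(k+3))$, a power law with exponent 3.

```python
import math

def stationary_degree_distribution(f, alpha, k_max):
    """Return [p_0, ..., p_{k_max}] with p_k = alpha/(alpha+f_k) * prod_{i<k} f_i/(alpha+f_i)."""
    probs, prod = [], 1.0
    for k in range(k_max + 1):
        probs.append(prod * alpha / (alpha + f(k)))
        prod *= f(k) / (alpha + f(k))   # extend the product to include index k
    return probs

# Hypothetical affine weights f_k = k + 1, for which alpha* = a + b = 2.
p = stationary_degree_distribution(lambda k: k + 1.0, 2.0, 2000)
slope = math.log(p[2000] / p[1000]) / math.log(2.0)
print(p[:3], slope)
```

The printed local log-log slope should be close to $-3 = -(2 + b/a)$, matching the power-law tail of the stationary case; no aging is included in this sketch.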
In Section A.2, we investigate the two examples in Figure 8, showing that the limiting degree distribution has exponential tails, a fact that we know in general just as an upper bound (see Lemma 5.3).
In order to apply the general CTBP result in Theorem 3.10 below, we need to prove that an aging process $(N_t)_{t\ge 0}$ is supercritical and Malthusian. We show in Section 4 that, for an integrable aging function $g$, the corresponding process is supercritical if and only if
\[ \lim_{t\to\infty} \mathbb{E}\big[V_{G(t)}\big] > 1. \tag{2.5} \]
Condition (2.5) heuristically suggests that the process $(N_t)_{t\ge 0}$ has a Malthusian parameter if and only if the expected number of children in the entire lifetime of a fixed individual is larger than one, which seems quite reasonable. In particular, such a result follows from the fact that if $g$ is integrable, then the Laplace transform is always finite for every $\alpha > 0$. In other words, since $N_{T_{\alpha^*}}$ has the same distribution as $V_{G(T_{\alpha^*})}$, we have $\mathbb{E}[N_{T_{\alpha^*}}] \le \mathbb{E}[V_{G(\infty)}]$. This implies that $G(\infty)$ cannot be too small, as otherwise the Malthusian parameter would not exist, and the CTBP would die out $\mathbb{P}$-a.s. The aging effect obviously slows down the birth process, and makes the limiting degree distribution have exponential tails for affine preferential attachment weights. One may wonder whether the power-law degree distribution could be restored when $(f_k)_{k\in\mathbb{N}}$ grows super-linearly instead. Here, we say that a sequence of weights $(f_k)_{k\in\mathbb{N}}$ grows super-linearly when $\sum_{k\ge 1} 1/f_k < \infty$ (see Definition 3.16). In the super-linear case, however, the branching process is explosive, i.e., for every individual the probability of generating an infinite number of children in finite time is 1. In this situation, the Malthusian parameter does not exist, since the Laplace transform of the process is always infinite. One could ask whether, by using an integrable aging function, this explosive behavior is destroyed. The answer to this question is given by the following theorem:
Theorem 2.3 (Explosive aging branching processes for super-linear attachment weights). Consider a stationary process $(V_t)_{t\ge 0}$ defined by super-linear PA weights $(f_k)_{k\in\mathbb{N}}$. For any aging function $g$, the corresponding non-stationary process $(N_t)_{t\ge 0}$ is explosive.
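The supercriticality discussion can be checked numerically in a simple special case. The sketch below assumes affine weights $f_k = ak + b$ and the aging function $g(t) = e^{-\lambda t}$ (both hypothetical choices), uses the standard mean $\mathbb{E}[V_s] = \frac{b}{a}(e^{as} - 1)$ of the affine birth process, and rewrites $\mathbb{E}[N_{T_\alpha}] = \int_0^\infty e^{-\alpha t}\, b\, g(t)\, e^{aG(t)}\,dt$ with the substitution $u = G(t)$ as an integral over $[0, 1/\lambda]$. The Malthusian parameter is then located by bisection; the function names are illustrative only.

```python
import math

def expected_children(alpha, a, b, lam, steps=5000):
    """E[N_{T_alpha}] for f_k = a k + b and g(t) = exp(-lam t), via the midpoint rule.

    After substituting u = G(t), the integrand is b * (1 - lam u)^(alpha/lam) * exp(a u)
    on the interval [0, 1/lam].
    """
    h = (1.0 / lam) / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += (1.0 - lam * u) ** (alpha / lam) * math.exp(a * u)
    return b * h * total

def malthusian_parameter(a, b, lam, tol=1e-10):
    """Solve E[N_{T_alpha}] = 1 by bisection; return None if the process is not supercritical."""
    lifetime_mean = (b / a) * (math.exp(a / lam) - 1.0)   # E[V_{G(infinity)}]
    if lifetime_mean <= 1.0:
        return None
    lo, hi = 0.0, b * math.exp(a / lam)   # at hi the expected number of children is below 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_children(mid, a, b, lam) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(malthusian_parameter(1.0, 1.0, 1.0))   # supercritical: lifetime mean e - 1 > 1
print(malthusian_parameter(1.0, 1.0, 2.0))   # lifetime mean below 1, so None
```

For $a = b = 1$ the value found with aging lies strictly below $a + b = 2$, the Malthusian parameter of the corresponding process without aging, in line with the observation that aging slows down the birth process.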
The proof of Theorem 2.3 is rather simple, and is given in Section 4.2. We investigate the case of affine PA weights $f_k = ak + b$ in more detail in Section 5.1. Under a hypothesis on the regularity of the integrable aging function, in Proposition 5.2, we give the asymptotic behavior of the corresponding limiting degree distribution as $k \to \infty$. The asymptotic expression involves positive constants $C_1$, $C_2$ and a term $G(k, g)$ that is a function of $k$, the aging function $g$ and its derivative. The precise behavior of this term depends crucially on the aging function. Apart from this, we notice that aging generates an exponential term in the distribution, which explains the two examples in Figure 8. In Section A.2, we prove that the two limiting degree distributions in Figure 8 indeed have exponential tails.

2.3 Results with aging and fitness. The analysis of birth processes becomes harder when we also consider fitness. First of all, we define the birth process with aging and fitness as follows:
Definition 2.4 (Aging birth process with fitness). Consider a birth process $(V_t)_{t\ge 0}$. Let $g : \mathbb{R}^+ \to \mathbb{R}^+$ be an aging function, and $Y$ a positive random variable. The process $M_t := V_{Y G(t)}$ is called a birth process with aging and fitness.
Definition 2.4 implies that the infinitesimal jump rates of the process $(M_t)_{t\ge 0}$ are as in (2.1), so that the birth probabilities of an individual depend on the PA weights, on the age of the individual and on its fitness. Assuming that the process $(M_t)_{t\ge 0}$ is supercritical and Malthusian, we can prove the following theorem:
Theorem 2.5 (Limiting degree distribution for aging and fitness). Consider a process $(M_t)_{t\ge 0}$ with integrable aging function $g$ and fitnesses that are i.i.d. across the population, and assume that it is supercritical and Malthusian with Malthusian parameter $\alpha^*$. Then, the limiting degree distribution for the corresponding branching process is given by
\[ p_k = \mathbb{E}\left[ \frac{\alpha^*}{\alpha^* + f_k Y \hat{\mathcal{L}}(k, \alpha^*, Y)} \prod_{i=0}^{k-1} \frac{f_i Y \hat{\mathcal{L}}(i, \alpha^*, Y)}{\alpha^* + f_i Y \hat{\mathcal{L}}(i, \alpha^*, Y)} \right]. \]
For a fixed individual, the distribution $(q_k)_{k\in\mathbb{N}}$ of the number of children it generates over its entire lifetime is given by
\[ q_k = \mathbb{P}\big(V_{Y G(\infty)} = k\big). \]
Similarly to Theorem 2.2, the sequence $(\hat{\mathcal{L}}(k, \alpha^*, Y))_{k\in\mathbb{N}}$ is given by
\[ \hat{\mathcal{L}}(k, \alpha^*, Y) = \left( \frac{\mathcal{L}\big(\mathbb{P}(V_{uG(\cdot)} = k)\, g(\cdot)\big)(\alpha^*)}{\mathcal{L}\big(\mathbb{P}(V_{uG(\cdot)} = k)\big)(\alpha^*)} \right)_{u = Y}, \]
where again $\mathcal{L}(h(\cdot))(\alpha)$ denotes the Laplace transform of a function $h$. Notice that in this case, with the presence of the fitness $Y$, this sequence is no longer deterministic but random instead. We still have the product structure for $(p_k)_{k\in\mathbb{N}}$ as in the stationary case, but now we have to average over the fitness distribution. We point out that Theorem 2.2 is a particular case of Theorem 2.5, when we consider $Y \equiv 1$. We state the two results as separate theorems to improve the logic of the presentation. We prove Theorem 2.5 in Section 4.1. In Section 4.2 we show how Theorem 2.2 can be obtained from Theorem 2.5, and in particular how Condition (2.5) is obtained from the analogous Condition (2.6) stated below for general fitness distributions.
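A closed-form special case may help to see how fitness with an exponential tail can produce a power law whose exponent depends on age. The sketch below is an illustration under assumptions that are not the general setting of the theorem: weights $f_k = k + 1$, for which $\mathbb{P}(V_s \ge k) = (1 - e^{-s})^k$, and fitness $Y \sim \mathrm{Exp}(\theta)$. Averaging over $Y$ then gives $\mathbb{P}(M_t \ge k) = \Gamma(c+1)\Gamma(k+1)/\Gamma(k+1+c)$ with $c = \theta/G(t)$, which behaves like $\Gamma(c+1)\,k^{-c}$ for large $k$.

```python
import math

def tail_probability(k, theta, g_t):
    """P(M_t >= k) for f_k = k + 1 and Y ~ Exp(theta), where g_t = G(t).

    Uses P(V_s >= k) = (1 - exp(-s))^k and the fact that exp(-Y * g_t) is
    Beta(c, 1)-distributed with c = theta / g_t.
    """
    c = theta / g_t
    log_tail = math.lgamma(c + 1.0) + math.lgamma(k + 1.0) - math.lgamma(k + 1.0 + c)
    return math.exp(log_tail)

# Hypothetical values: theta = 3, and G(t) increasing with the age t.
for g_t in (0.5, 1.0, 1.5):
    c = 3.0 / g_t
    k = 10**4
    print(g_t, c, tail_probability(k, 3.0, g_t) * k**c / math.gamma(c + 1.0))
```

The last printed column should be close to 1, confirming the tail $k^{-c}$ with $c = \theta/G(t)$. As $G(t)$ grows with $t$, the exponent decreases, so the tail becomes heavier with age, and it stabilizes at $\theta/G(\infty)$.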
With affine PA weights, in Proposition 5.5, we can identify the asymptotics of the limiting degree distribution we obtain. This is proved by similar techniques as in the case of aging only, even though the result cannot be expressed so easily. In particular, we prove an asymptotic expression in which the function $\Psi_k(t, s)$ depends on the aging function, the density $\mu$ of the fitness and $k$, and which involves a bivariate normal vector with covariance matrix related to $H_k(t, s)$. We do not know the necessary and sufficient conditions for the existence of the minimum $(t_k, s_k)$ appearing in it. However, in Section 5.3, we consider two examples where we can apply this result, and we show that it is possible to obtain power laws for them.
In the case of aging and fitness, the supercriticality condition in (2.5) is replaced by the analogous condition that
$$\mathbb{E}\left[V_{YG(t)}\right] < \infty \ \text{ for every } t \geq 0 \qquad\text{and}\qquad \lim_{t\to\infty}\mathbb{E}\left[V_{YG(t)}\right] > 1. \qquad (2.6)$$
Borgs et al. [6] and Dereich [9], [10] prove results on stationary CTBPs with fitness. In these works, the authors investigate models with affine dependence on the degree and bounded fitness distributions. Boundedness is necessary since, with affine weights, unbounded fitness distributions lead to explosive processes, which do not have a Malthusian parameter. We refer to Section 3.3 for a more precise discussion of the conditions on fitness distributions.
In the case of integrable aging and fitness, it is possible to consider affine PA weights, even with unbounded fitness distributions, as exemplified by (2.6). In particular, for $f_k = ak + b$,
$$\mathbb{E}\left[V_{YG(t)}\right] = \frac{b}{a}\,\mathbb{E}\left[e^{aYG(t)} - 1\right].$$
As a consequence, Condition (2.6) can be written as
$$\mathbb{E}\left[e^{aYG(t)}\right] < \infty \ \text{ for every } t \geq 0 \qquad\text{and}\qquad \lim_{t\to\infty}\mathbb{E}\left[e^{aYG(t)}\right] > \frac{a+b}{b}.$$
The expected value $\mathbb{E}[e^{aYG(t)}]$ is the moment generating function of $Y$ evaluated at $aG(t)$. In particular, a necessary condition to have a Malthusian parameter is that the moment generating function is finite on the interval $[0, aG(\infty))$. As a consequence, denoting $\mathbb{E}[e^{sY}]$ by $\varphi_Y(s)$, we have effectively moved from the condition of having bounded distributions to the condition
$$\varphi_Y(x) < +\infty \ \text{ for every } x < aG(\infty) \qquad\text{and}\qquad \lim_{x\to aG(\infty)}\varphi_Y(x) > \frac{a+b}{b}. \qquad (2.8)$$
Condition (2.8) is weaker than assuming a bounded distribution for the fitness $Y$, which means we can consider a larger class of distributions for the aging and fitness birth processes. Particularly for citation networks, it seems reasonable to have unbounded fitnesses, as the relative popularity of papers varies substantially.
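As a small numerical illustration of the moment-generating-function condition, the following sketch checks it for an exponential fitness with rate $\theta$, for which $\varphi_Y(x) = \theta/(\theta - x)$ when $x < \theta$. It assumes the affine identity $\mathbb{E}[V_t] = (b/a)(e^{at}-1)$, so that the condition reads: $\varphi_Y$ is finite on $[0, aG(\infty))$ and its limit at $aG(\infty)$ exceeds $(a+b)/b$. The function name and the parameter values are hypothetical, and the boundary case $\theta = aG(\infty)$ is excluded, in line with the requirement $\theta > aG(\infty)$ used in Section 2.4.

```python
def exponential_fitness_is_supercritical(theta, a, b, G_inf):
    """Check the MGF condition for Y ~ Exp(theta) and weights f_k = a*k + b.

    Assumed reading of the condition: phi_Y(x) = theta / (theta - x) must be
    finite for x < a * G_inf, and its limit as x -> a * G_inf must exceed
    (a + b) / b. G_inf stands for G(infinity), the total mass of the aging
    function.
    """
    x_max = a * G_inf
    if theta <= x_max:
        # phi_Y blows up at x = theta <= a * G_inf: treated as explosive
        # (boundary case theta == a * G_inf excluded by assumption).
        return False
    limit = theta / (theta - x_max)
    return limit > (a + b) / b


if __name__ == "__main__":
    for theta in (0.5, 1.5, 3.0):
        print(theta, exponential_fitness_is_supercritical(theta, 1.0, 1.0, 1.0))
```

With $a = b = G(\infty) = 1$, the threshold is $(a+b)/b = 2$: the rate $\theta = 1.5$ passes, $\theta = 3$ fails because the fitness is too small to make the process supercritical, and $\theta = 0.5$ fails because the moment generating function is infinite.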
2.4 Dynamical power-laws for exponential fitness and integrable aging. In Section 5.3 we introduce three different classes of fitness distributions, for which we give the asymptotics for the limiting degree distribution of the corresponding CTBP. The first class is called heavy-tailed. Recalling (2.8), any distribution $Y$ in this class satisfies, for any $t > 0$,
$$\varphi_Y(t) = \mathbb{E}\left[e^{tY}\right] = +\infty. \qquad (2.9)$$
These distributions have a tail that is thicker than exponential. For instance, power-law distributions belong to this first class. Similarly to unbounded distributions in the stationary regime, such distributions generate explosive birth processes, independently of the choice of the integrable aging function.
The second class is called sub-exponential. The density $\mu$ of a distribution $Y$ in this class satisfies
$$\lim_{s\to+\infty} \mu(s)\,e^{\theta s} = 0 \quad \text{for every } \theta > 0. \qquad (2.10)$$
An example of this class is the density $\mu(s) = Ce^{-\theta s^{1+\varepsilon}}$, for some $\varepsilon, C, \theta > 0$. For such a density, we show in Proposition 5.7 that the corresponding limiting degree distribution has a thinner tail than a power law.

The third class is called general-exponential. The density $\mu$ of a distribution $Y$ in this class is of the form
$$\mu(s) = C\,h(s)\,e^{-\theta s}, \qquad (2.11)$$
where $h(s)$ is a twice differentiable function such that $h'(s)/h(s) \to 0$ and $h''(s)/h(s) \to 0$ as $s \to \infty$, and $C$ is a normalization constant. For instance, exponential and Gamma distributions belong to this class. From (2.8), we know that in order to obtain a non-explosive process, it is necessary to take the exponential rate $\theta > aG(\infty)$. We will see that the limiting degree distribution obeys a power law whenever $\theta > aG(\infty)$, with tails becoming thinner as $\theta$ increases.
For a distribution in the general-exponential class, as proven in Proposition 5.6, the limiting degree distribution of the corresponding CTBP has a power-law term, with slowly-varying corrections given by the aging function $g$ and the function $h$. We do not state Propositions 5.6 and 5.7 here, as these need notation and results from Section 5.1. For this reason, we only state the result for the special case of a purely exponential fitness distribution:

Corollary 2.6 (Exponential fitness distribution). Let the fitness $Y$ be exponentially distributed with parameter $\theta$, and let $g$ be an integrable aging function. Assume that the corresponding birth process $(M_t)_{t\geq 0}$ is supercritical and Malthusian. Then, the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$ of the corresponding CTBP $M$ is
$$p_k = \mathbb{E}\left[\frac{\theta}{\theta + f_k G(T_{\alpha^*})}\prod_{i=0}^{k-1}\frac{f_i G(T_{\alpha^*})}{\theta + f_i G(T_{\alpha^*})}\right],$$
where $T_{\alpha^*}$ is an exponentially distributed random variable with parameter $\alpha^*$. The distribution $(q_k)_{k\in\mathbb{N}}$ of the number of children of a fixed individual in its entire lifetime is given by
$$q_k = \frac{\theta}{\theta + G(\infty) f_k}\prod_{i=0}^{k-1}\frac{G(\infty) f_i}{\theta + G(\infty) f_i}.$$

Using exponential fitness makes the computation of the Laplace transform and of the limiting degree distribution easier. We refer to Section 5.4 for the precise proof. In particular, the sequence defined in Corollary 2.6 is very similar to the limiting degree distribution of a stationary process with a bounded fitness. Let $(\xi^Y_t)_{t\geq 0}$ be a birth process with PA weights $(f_k)_{k\in\mathbb{N}}$ and fitness $Y$ with bounded support. As proved in [10, Corollary 2.8], and as we show in Section 3.3, the limiting degree distribution of the corresponding branching process, assuming that $(\xi^Y_t)_{t\geq 0}$ is supercritical and Malthusian, has the form
$$p_k = \mathbb{E}\left[\frac{\alpha^*}{\alpha^* + f_k Y}\prod_{i=0}^{k-1}\frac{f_i Y}{\alpha^* + f_i Y}\right].$$
We notice the similarities with the limiting degree sequence given by Corollary 2.6. When $g$ is integrable, the random variable $G(T_{\alpha^*})$ has bounded support, and the sequence of Corollary 2.6 is of exactly this form, with $Y$ replaced by $G(T_{\alpha^*})$ and $\alpha^*$ replaced by $\theta$. As a consequence, the limiting degree distribution of the process $(M_t)_{t\geq 0}$ equals that of a stationary process with fitness $G(T_{\alpha^*})$ and Malthusian parameter $\theta$.
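In the case where $Y$ has exponential distribution and the PA weights are affine, we can also investigate the occurrence of dynamical power laws. In fact, with $(M_t)_{t\geq 0}$ such a process, the exponential distribution of $Y$ leads to
$$\mathbb{P}(M_t = k) = \frac{\theta}{aG(t)}\,\frac{\Gamma\left(\frac{b}{a} + \frac{\theta}{aG(t)}\right)}{\Gamma\left(\frac{b}{a}\right)}\,\frac{\Gamma\left(k + \frac{b}{a}\right)}{\Gamma\left(k + \frac{b}{a} + 1 + \frac{\theta}{aG(t)}\right)}. \qquad (2.12)$$
Here, $M_t$ describes the number of children of an individual of age $t$. In other words, $(\mathbb{P}(M_t = k))_{k\in\mathbb{N}}$ is a distribution such that, as $k \to \infty$,
$$\mathbb{P}(M_t = k) = \frac{\theta}{aG(t)}\,\frac{\Gamma\left(\frac{b}{a} + \frac{\theta}{aG(t)}\right)}{\Gamma\left(\frac{b}{a}\right)}\,k^{-\left(1 + \theta/(aG(t))\right)}\,(1 + o(1)).$$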
This means that for every time $t > 0$, the random variable $M_t$ has a power-law distribution with exponent $\tau(t) = 1 + \theta/(aG(t)) > 2$. In particular, for every $t > 0$, $M_t$ has finite expectation. We call this behavior, where power laws occur that vary with the age of the individuals, a dynamical power law. This occurs not only in the case of pure exponential fitness, but in general for every distribution as in (2.11), as shown in Proposition 5.6 below. Further, we see that when $t \to \infty$, the dynamical power-law exponent coincides with the power-law exponent of the entire population. Indeed, the limiting degree distribution equals
$$p_k = \mathbb{E}\left[\frac{\theta}{aG(T_{\alpha^*})}\,\frac{\Gamma\left(\frac{b}{a} + \frac{\theta}{aG(T_{\alpha^*})}\right)}{\Gamma\left(\frac{b}{a}\right)}\,\frac{\Gamma\left(k + \frac{b}{a}\right)}{\Gamma\left(k + \frac{b}{a} + 1 + \frac{\theta}{aG(T_{\alpha^*})}\right)}\right]. \qquad (2.13)$$
In Figure 9, we show a numerical example of the dynamical power law for a process with exponential fitness distribution and affine weights. As time increases, the power-law exponent monotonically decreases to the limiting exponent $\tau \equiv \tau(\infty) > 2$, which means that the limiting distribution still has finite first moment. Note the similarity to the case of citation networks in Figure 3.
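The monotone decrease of the dynamical exponent can be made concrete with a short sketch. It evaluates $\tau(t) = 1 + \theta/(aG(t))$ for one hypothetical aging function, $g(t) = e^{-\lambda t}$, so that $G(t) = (1 - e^{-\lambda t})/\lambda$ and $G(\infty) = 1/\lambda$; this choice of $g$, the function names and the parameter values are assumptions made only for illustration.

```python
import math


def G(t, lam=1.0):
    """Integrated aging G(t) for the hypothetical choice g(t) = exp(-lam * t)."""
    return (1.0 - math.exp(-lam * t)) / lam


def tau(t, theta, a, lam=1.0):
    """Dynamical power-law exponent tau(t) = 1 + theta / (a * G(t)), for t > 0."""
    return 1.0 + theta / (a * G(t, lam))


def tau_limit(theta, a, lam=1.0):
    """Limiting exponent tau(inf) = 1 + theta / (a * G(inf)), with G(inf) = 1 / lam."""
    return 1.0 + theta * lam / a


if __name__ == "__main__":
    theta, a = 2.0, 1.0  # theta > a * G(inf) = 1, so tau(inf) > 2
    for t in (0.5, 1.0, 2.0, 5.0, 50.0):
        print(t, tau(t, theta, a))
    print("limit", tau_limit(theta, a))
```

Since $G$ is increasing, $\tau(t)$ decreases in $t$ and approaches $\tau(\infty)$, which exceeds 2 exactly when $\theta > aG(\infty)$.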
When $t \to \infty$, the power-law exponent converges, and $M_t$ also converges in distribution to a limiting random variable $M_\infty$, whose distribution is given by
$$\mathbb{P}(M_\infty = k) = \frac{\theta}{aG(\infty)}\,\frac{\Gamma\left(\frac{b}{a} + \frac{\theta}{aG(\infty)}\right)}{\Gamma\left(\frac{b}{a}\right)}\,\frac{\Gamma\left(k + \frac{b}{a}\right)}{\Gamma\left(k + \frac{b}{a} + 1 + \frac{\theta}{aG(\infty)}\right)}. \qquad (2.14)$$
$M_\infty$ has a power-law distribution, where the power-law exponent is
$$\tau = \lim_{t\to\infty}\tau(t) = 1 + \frac{\theta}{aG(\infty)}.$$
In particular, since $\tau > 2$, a fixed individual has a finite expected number of children also in its entire lifetime, unlike the stationary case with affine weights. In terms of citation networks, this type of process predicts that papers do not receive an infinite number of citations after they are published (recall Figure 5). Figure 8 shows the effect of aging on the stationary process with affine weights, where the power law is lost due to the aging effect. Thus, aging slows down the stationary process, and it is not possible to create the number of high-degree vertices that are present in power-law distributions. Fitness can speed up the aging process to gain high-degree vertices, so that the power-law distribution is restored. This is shown in Figure 10, where aging is combined with exponential fitness for the same aging functions as in Figure 8.
In the stationary case, it is not possible to use unbounded distributions for the fitness to obtain a Malthusian process if the PA weights (f k ) k∈N are affine. In fact, using unbounded distributions, the expected number of children at exponential time T α is not finite for any α > 0, i.e., the branching process is explosive. The aging effect allows us to relax the condition on the fitness, and the restriction to bounded distributions is relaxed to a condition on its moment generating function.

Conclusion and open problems.
Beyond the tree setting. In this paper, we only consider the tree setting, which is clearly unrealistic for citation networks. However, the analysis of PAMs has shown that the qualitative features of the degree distribution for PAMs are identical to those in the tree setting. Proving this remains an open problem that we hope to address hereafter. Should this indeed be the case, then we could summarize our findings in the following simple way: The power-law tail distribution of PAMs is destroyed by integrable aging, and cannot be restored either by super-linear weights or by adding bounded fitnesses. However, it is restored by unbounded fitnesses with at most an exponential tail. Part of these results are example based, while we have general results proving that the limiting degree distribution exists.
Structure of the paper. The present paper is organized as follows. In Section 3, we quote general results on CTBPs, in particular Theorem 3.10 that we use throughout our proofs. In Section 3.2, we describe known properties of the stationary regime. In Section 3.3, we briefly discuss the Malthusian parameter, focusing on conditions on fitness distributions to obtain supercritical processes. In Section 4, we prove Theorem 2.3 and 2.5, and we show how Theorem 2.2 is a particular case of Theorem 2.5. In Section 5 we specialize to the case of affine PA function, giving precise asymptotics.

General theory of Continuous-Time Branching Processes
3.1 General set-up of the model. In this section we present the general theory of continuous-time branching processes (CTBPs). In such models, individuals produce children according to i.i.d. copies of the same birth process. We now define birth processes in terms of point processes: A point process $\xi$ is defined by a sequence of positive real-valued random variables $(T_k)_{k\in\mathbb{N}}$. With abuse of notation, we can denote the density of the point process $\xi$ by
$$\xi(dt) = \sum_{k\in\mathbb{N}} \delta_{T_k}(dt),$$
where $\delta_x(dt)$ is the delta measure in $x$, and the random measure $\xi$ evaluated on $[0,t]$ as
$$\xi(t) = \xi([0,t]) = \sum_{k\in\mathbb{N}} \mathbb{1}_{[0,t]}(T_k).$$
We suppose throughout the paper that $T_k < T_{k+1}$ with probability 1 for every $k \in \mathbb{N}$.
Remark 3.2. Equivalently, considering a sequence $(T_k)_{k\in\mathbb{N}}$ (where $T_0 = 0$) of positive real-valued random variables, such that $T_k < T_{k+1}$ with probability 1, we can define
$$\xi([0,t]) = k \quad \text{when } t \in [T_k, T_{k+1}).$$
We will often define a point process from the jump-times sequence of an integer-valued process $(V_t)_{t\geq 0}$. For instance, consider $(V_t)_{t\geq 0}$ as a Poisson process, and denote $T_k = \inf\{t > 0 : V_t \geq k\}$.
Then we can use the sequence (T k ) k∈N to define a point process ξ. The point process defined from the jump times of a process (V t ) t≥0 will be denoted by ξ V .
We now introduce some notation before giving the definition of CTBP. We denote the set of individuals in the population using Ulam-Harris notation for trees. The set of individuals is
$$\mathcal{N} = \bigcup_{n\in\mathbb{N}} \mathbb{N}^n.$$
For $x \in \mathbb{N}^n$ and $k \in \mathbb{N}$ we denote the $k$-th child of $x$ by $xk \in \mathbb{N}^{n+1}$. This construction is well known, and has been used in other works on branching processes (see [15], [18], [23] for more details).
We now are ready to define our branching process: Definition 3.3 (Continuous-time branching process). Given a point process ξ, we define the CTBP associated to ξ as the pair of a probability space and an infinite set (ξ x ) x∈N of i.i.d. copies of the process ξ. We will denote the branching process by ξ.
Remark 3.4 (Point processes and their jump times). Throughout the paper, we will define point processes in terms of jump times of processes (V t ) t≥0 . In order to keep the notation light, we will denote branching processes defined by point processes given by jump times of the process V t by V .
To make it more clear, by V we denote a probability space as in Definition 3.3 and an infinite set of measures (ξ x V ) x∈N , where ξ V is the point process defined by the process V .
According to Definition 3.3, a branching process is a pair of a probability space and a sequence of random measures. It is possible though to define an evolution of the branching population. At time t = 0, our population consists only of the root, denoted by ∅. Every time t an individual x gives birth to its k-th child, i.e., ξ x (t) = k + 1, assuming that ξ x (t−) = k, we start the process ξ xk . Formally: Definition 3.5 (Population birth times). We define the sequence of birth times for the process ξ as τ ξ ∅ = 0, and for x ∈ N , τ ξ xk = τ ξ x + inf {s ≥ 0 : ξ x (s) ≥ k} .
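The recursive definition of the birth times can be illustrated by a small simulation. The sketch below assumes, purely for illustration, that every individual reproduces according to a homogeneous Poisson process of rate 1 (a hypothetical choice of $\xi$); individuals are labelled by tuples in Ulam-Harris notation, with the root `()` and the $k$-th child of `x` written `x + (k,)`. The function name is not from the paper.

```python
import heapq
import random


def simulate_birth_times(t_max, rate=1.0, seed=0):
    """Simulate the birth times of a CTBP up to time t_max.

    Every individual reproduces according to an independent Poisson process
    with the given rate (an assumption of this sketch). Returns a dict that
    maps Ulam-Harris labels (tuples of ints) to birth times.
    """
    rng = random.Random(seed)
    birth_time = {(): 0.0}
    # Heap of (absolute time of next birth, parent label, index of that child).
    events = [(rng.expovariate(rate), (), 1)]
    while events:
        t, parent, k = heapq.heappop(events)
        if t > t_max:
            break  # heap order: all remaining events happen after t_max
        child = parent + (k,)
        birth_time[child] = t
        # Schedule the parent's next child and the new individual's first child.
        heapq.heappush(events, (t + rng.expovariate(rate), parent, k + 1))
        heapq.heappush(events, (t + rng.expovariate(rate), child, 1))
    return birth_time


if __name__ == "__main__":
    population = simulate_birth_times(2.0, seed=1)
    print(len(population), "individuals born by time 2")
```

For this particular choice of $\xi$ the population size at time $t$ has mean $e^{t}$, which gives a simple sanity check on the simulation.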
In this way we have defined the set of individuals, their birth times and the processes according to which they reproduce. We still need a way to count how many individuals are alive at a certain time t. Definition 3.6 (Random characteristic). A random characteristic is a real-valued process Φ : Ω × R → R such that Φ(ω, s) = 0 for any s < 0, and Φ(ω, s) = Φ(s) is a deterministic bounded function for every s ≥ 0 that only depends on ω through the birth time of the individual, as well as the birth process of its children.
An important example of a random characteristic is obtained by the function 1 R + (s), which measures whether the individual has been born at time s. Another example is 1 R + (s)1 {k} (ξ), which measures whether the individual has been born or not at time s and whether it has k children presently.
For each individual x ∈ N , Φ x (ω, s) denotes the value of Φ evaluated on the progeny of x, regarding x as ancestor, when the age of x is s. In other words, Φ x (ω, s) is the evaluation of Φ on the tree rooted at x, ignoring the rest of the population. If we do not specify the individual x, then we assume that Φ = Φ ∅ . We use random characteristics to describe the properties of the branching population.
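Definition 3.7 (Evaluated branching processes). Consider a random characteristic $\Phi$ as in Definition 3.6. We define the evaluated branching process with respect to $\Phi$ at time $t \in \mathbb{R}^+$ as
$$\xi^{\Phi}_t = \sum_{x\in\mathcal{N}} \Phi_x\left(t - \tau^{\xi}_x\right).$$
The meaning of the evaluated branching process is clear when we consider the random characteristic $\Phi(t) = \mathbb{1}_{\mathbb{R}^+}(t)$, for which
$$\xi^{\mathbb{1}_{\mathbb{R}^+}}_t = \sum_{x\in\mathcal{N}} \mathbb{1}_{\mathbb{R}^+}\left(t - \tau^{\xi}_x\right),$$
which is the number of $x \in \mathcal{N}$ such that $t - \tau^{\xi}_x \geq 0$, i.e., the total number of individuals already born up to time $t$. Another characteristic that we consider in this paper is, for $k \in \mathbb{N}$, $\Phi_k(t) = \mathbb{1}_{\{k\}}(\xi(t))$, for which
$$\xi^{\Phi_k}_t = \sum_{x\in\mathcal{N}} \mathbb{1}_{\{k\}}\left(\xi^x(t - \tau^{\xi}_x)\right)$$
is the number of individuals with $k$ children at time $t$.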
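As known from the literature, the properties of the branching process are determined by the behavior of the point process $\xi$. First of all, we need to introduce some notation. Consider a function $f : \mathbb{R}^+ \to \mathbb{R}$. We denote the Laplace transform of $f$ by
$$\mathcal{L}(f(\cdot))(\alpha) = \int_0^\infty e^{-\alpha t} f(t)\,dt.$$
With a slight abuse of notation, if $\mu$ is a positive measure on $\mathbb{R}^+$, then we denote
$$\mathcal{L}(\mu(d\cdot))(\alpha) = \int_0^\infty e^{-\alpha t}\,\mu(dt).$$
We use the Laplace transform to analyze the point process $\xi$:

Definition 3.8 (Supercritical property). Consider a point process $\xi$ on $\mathbb{R}^+$. We say $\xi$ is supercritical when there exists $\alpha^* > 0$ such that
$$\mathcal{L}(\mathbb{E}\xi(d\cdot))(\alpha^*) = \int_0^\infty e^{-\alpha^* t}\,\mathbb{E}\xi(dt) = 1.$$
We call $\alpha^*$ the Malthusian parameter of the process $\xi$.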

We point out that $\mathbb{E}\xi(d\cdot)$ is an abuse of notation to denote the density of the averaged measure $\mathbb{E}[\xi([0,t])]$. A second fundamental property for the analysis of branching processes is the following:

Definition 3.9 (Malthusian property). Consider a supercritical point process $\xi$, with Malthusian parameter $\alpha^*$. The process $\xi$ is Malthusian when
$$-\frac{d}{d\alpha}\,\mathcal{L}(\mathbb{E}\xi(d\cdot))(\alpha)\Big|_{\alpha=\alpha^*} = \int_0^\infty t\,e^{-\alpha^* t}\,\mathbb{E}\xi(dt) < \infty,$$
and we will also assume that the process satisfies a further technical condition.

Integrating by parts, it is possible to show that, for a point process $\xi_V$,
$$\mathcal{L}(\mathbb{E}\xi_V(d\cdot))(\alpha) = \mathbb{E}\left[V_{T_\alpha}\right],$$
where $T_\alpha$ is an exponentially distributed random variable with parameter $\alpha$, independent of the process $(V_t)_{t\geq 0}$. Heuristically, the Laplace transform of a point process $\xi_V$ is the expected number of children born at the exponentially distributed time $T_\alpha$. In this case the Malthusian parameter is the exponential rate $\alpha^*$ such that, at time $T_{\alpha^*}$, exactly one child has been born on average. These two conditions are required to prove the main result on branching processes that we rely upon:

Theorem 3.10 (Population exponential growth). Consider the point process $\xi$, and the corresponding branching process $\xi$. Assume that $\xi$ is supercritical and Malthusian with parameter $\alpha^*$, and suppose that there exists $\bar\alpha < \alpha^*$ such that $\int_0^\infty e^{-\bar\alpha t}\,\mathbb{E}\xi(dt) < \infty$.

Then
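(1) there exists a random variable $\Theta$ such that, as $t \to \infty$,
$$e^{-\alpha^* t}\,\xi^{\mathbb{1}_{\mathbb{R}^+}}_t \longrightarrow \Theta \quad \mathbb{P}\text{-a.s.}; \qquad (3.3)$$
(2) for any two random characteristics $\Phi$ and $\Psi$,
$$\frac{\xi^{\Phi}_t}{\xi^{\Psi}_t} \longrightarrow \frac{\mathcal{L}(\mathbb{E}[\Phi(\cdot)])(\alpha^*)}{\mathcal{L}(\mathbb{E}[\Psi(\cdot)])(\alpha^*)} \quad \mathbb{P}\text{-a.s.} \qquad (3.4)$$
This result is stated in [23, Theorem A], which is a weaker version of [18, Theorem 6.3]. Formula (3.3) implies that, $\mathbb{P}$-a.s., the population size grows exponentially with time. Formula (3.4) says that the ratio between the evaluations of the branching process with two different characteristics converges $\mathbb{P}$-a.s. to a constant that depends only on the two characteristics involved. In particular, if we consider, for $k \in \mathbb{N}$, $\Phi(t) = \mathbb{1}_{\{k\}}(\xi(t))$ and $\Psi(t) = \mathbb{1}_{\mathbb{R}^+}(t)$, then Theorem 3.10 gives
$$\lim_{t\to\infty}\frac{\xi^{\Phi}_t}{\xi^{\Psi}_t} = \alpha^*\,\mathcal{L}\left(\mathbb{P}(\xi(\cdot) = k)\right)(\alpha^*),$$
since $\mathcal{L}(\mathbb{E}[\mathbb{1}_{\mathbb{R}^+}(\cdot)])(\alpha^*) = 1/\alpha^*$. The ratio in the previous formula is the fraction of individuals with $k$ children in the whole population:

Definition 3.12 (Limiting degree distribution for CTBP). The sequence $(p_k)_{k\in\mathbb{N}}$, where
$$p_k = \alpha^*\int_0^\infty e^{-\alpha^* t}\,\mathbb{P}(\xi(t) = k)\,dt,$$
is the limiting degree distribution for the branching process $\xi$.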
The aim of the following sections will be to study when point processes satisfy the conditions of Theorem 3.10, in order to analyze the limiting degree distribution in Definition 3.12.

Stationary birth processes with no fitness.
In this section we present the theory of birth processes that are stationary and have deterministic rates. This is relevant since the definition of aging processes starts with a stationary process. In particular, we give a description of the affine case, which plays a central role in the present work:

Definition 3.13 (Stationary non-fitness birth processes). Consider a non-decreasing sequence $(f_k)_{k\in\mathbb{N}}$ of positive real numbers. A stationary non-fitness birth process is a stochastic process $(V_t)_{t\geq 0}$ such that (1) $V_0 = 0$, and $V_t \in \mathbb{N}$ for all $t \in \mathbb{R}^+$; (2) $V_t \leq V_s$ for every $t \leq s$; (3) for $h$ small enough,
$$\mathbb{P}\left(V_{t+h} = k + j \mid V_t = k\right) = \begin{cases} f_k h + o(h) & \text{if } j = 1,\\ 1 - f_k h + o(h) & \text{if } j = 0,\\ o(h) & \text{if } j \geq 2.\end{cases}$$
We denote the jump times by $(T_k)_{k\in\mathbb{N}}$, i.e., $T_k = \inf\{t \geq 0 : V_t \geq k\}$. We denote the point process corresponding to $(V_t)_{t\geq 0}$ by $\xi_V$. In this case, $(V_t)_{t\geq 0}$ is an inhomogeneous Poisson process, and for every $k \in \mathbb{N}$, $T_{k+1} - T_k$ has exponential law with parameter $f_k$, independent of $(T_{h+1} - T_h)_{h=0}^{k-1}$. It is possible to show the following proposition:

Proposition 3.14 (Probabilities for $(V_t)_{t\geq 0}$). Consider a stationary non-fitness birth process $(V_t)_{t\geq 0}$. Denote, for every $k \in \mathbb{N}$, $\mathbb{P}(V_t = k) = P_k[V](t)$. Then
$$P_0[V](t) = e^{-f_0 t}, \qquad (3.8)$$
and, for $k \geq 1$,
$$P_k[V](t) = f_{k-1}\int_0^t e^{-f_k (t - x)}\,P_{k-1}[V](x)\,dx. \qquad (3.9)$$
For a proof, see [4, Chapter 3, Section 2]. From the jump times, it is easy to compute the explicit expression for the Laplace transform of $\xi_V$ as
$$\mathcal{L}(\mathbb{E}\xi_V(d\cdot))(\alpha) = \sum_{k\geq 1}\mathbb{E}\left[e^{-\alpha T_k}\right] = \sum_{k\geq 1}\prod_{i=0}^{k-1}\frac{f_i}{\alpha + f_i},$$
since every $T_k$ can be seen as a sum of independent exponential random variables with parameters given by the sequence $(f_k)_{k\in\mathbb{N}}$. Assuming now that $\xi_V$ is supercritical and Malthusian with parameter $\alpha^*$, we have the explicit expression for the limit distribution $(p_k)_{k\in\mathbb{N}}$, given by (2.2). An analysis of the behavior of the limit distribution of branching processes is presented in [2] and [22], where the authors prove that $(p_k)_{k\in\mathbb{N}}$ has a power-law tail only if the sequence of rates $(f_k)_{k\in\mathbb{N}}$ is asymptotically linear with respect to $k$. (1) for every $\alpha \in \mathbb{R}^+$, (2) The Malthusian parameter is $\alpha^* = a + b$, and $\tilde\alpha = a$, where $\tilde\alpha$ is defined as in (3.1). (
For affine PA weights $(f_k)_{k\in\mathbb{N}} = (ak + b)_{k\in\mathbb{N}}$, the Malthusian parameter $\alpha^*$ exists. Since $\alpha^* = a + b$, the limiting degree distribution of the branching process $V$ is given by
$$p_k = \left(1 + \frac{b}{a}\right)\frac{\Gamma\left(1 + \frac{2b}{a}\right)}{\Gamma\left(\frac{b}{a}\right)}\,\frac{\Gamma\left(k + \frac{b}{a}\right)}{\Gamma\left(k + 2 + \frac{2b}{a}\right)}. \qquad (3.10)$$
Notice that $p_k$ has a power-law decay with exponent $\tau = 2 + \frac{b}{a}$. Branching processes of this type are related to the PAM, also called the Barabási-Albert model [1]. This model shows the so-called old-get-richer effect. Clearly this is not true for real-world citation networks. In Figure 5, we notice that, on average, the increment in the citations received by old papers is smaller than the increment for younger papers. In other words, old papers tend to be cited less and less over time.
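The power-law decay in the affine case is easy to check numerically. The sketch below evaluates the product form $p_k = \frac{\alpha^*}{\alpha^* + f_k}\prod_{i=0}^{k-1}\frac{f_i}{\alpha^* + f_i}$ of the stationary limiting degree distribution with $f_k = ak + b$ and $\alpha^* = a + b$, and estimates the exponent from the log-log slope; the function names and the parameter values are hypothetical.

```python
import math


def limiting_degree_distribution(a, b, k_max):
    """Return [p_0, ..., p_{k_max}] for affine weights f_k = a*k + b.

    Uses p_k = alpha/(alpha + f_k) * prod_{i<k} f_i/(alpha + f_i) with
    Malthusian parameter alpha = a + b.
    """
    alpha = a + b
    p = []
    prod = 1.0  # running value of prod_{i<k} f_i / (alpha + f_i)
    for k in range(k_max + 1):
        f_k = a * k + b
        p.append(alpha / (alpha + f_k) * prod)
        prod *= f_k / (alpha + f_k)
    return p


def local_exponent(p, k):
    """Estimate the power-law exponent from the log-log slope between k and 2k."""
    return -math.log(p[2 * k] / p[k]) / math.log(2.0)


if __name__ == "__main__":
    for a, b in ((1.0, 1.0), (2.0, 1.0)):
        p = limiting_degree_distribution(a, b, 4000)
        print(a, b, sum(p), local_exponent(p, 1000), 2.0 + b / a)
```

For $a = b = 1$ the product telescopes to $p_k = 4/((k+1)(k+2)(k+3))$, so the estimated exponent approaches $\tau = 3$ and the probabilities sum to 1 up to a small truncation error.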

The Malthusian parameter.
The existence of the Malthusian parameter is a necessary condition to have a branching process growing at exponential rate. In particular, the Malthusian parameter does not exist in two cases: when the process is subcritical and grows slower than exponential, or when it is explosive. In the first case, the branching population might either die out or grow indefinitely with positive probability, but slower than at exponential rate. In the second case, the population size explodes in finite time with probability one. In both cases, the behavior of the branching population is different from what we observe in citation networks (Figure 1). For this reason, we focus on supercritical processes, i.e., on the case where the Malthusian parameter exists.
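Denote by $(V_t)_{t\geq 0}$ a stationary birth process defined by PA weights $(f_k)_{k\in\mathbb{N}}$. In general, we assume $f_k \to \infty$. Denote the sequence of jump times by $(T_k)_{k\in\mathbb{N}}$. As we quote in Section 3.2, the Laplace transform of a birth process $(V_t)_{t\geq 0}$ is given by
$$\mathcal{L}(\mathbb{E}\xi_V(d\cdot))(\alpha) = \sum_{k\geq 1}\prod_{i=0}^{k-1}\frac{f_i}{\alpha + f_i}.$$
This expression comes from the fact that, in the stationary regime, $T_k$ is the sum of $k$ independent exponential random variables. We can write
$$\sum_{k\geq 1}\prod_{i=0}^{k-1}\frac{f_i}{\alpha + f_i} = \sum_{k\geq 1}\exp\left(-\sum_{i=0}^{k-1}\log\left(1 + \frac{\alpha}{f_i}\right)\right) = \sum_{k\geq 1}\exp\left(-\alpha\sum_{i=0}^{k-1}\frac{1}{f_i}\,(1 + o(1))\right).$$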
The behavior of the Laplace transform depends on the asymptotic behavior of the PA weights. We define now the terminology we use: Definition 3.16 (Superlinear PA weights). Consider a PA weight sequence (f k ) k∈N . We say that the PA weights are As a general example, consider f k = ak q + b, where q > 0. In this case, the sequence is affine when q = 1, superlinear when q > 1 and sublinear when q < 1.
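When the weights are superlinear, since $\sum_{i=0}^{\infty} 1/f_i = C < \infty$, we have
$$\sum_{k\geq 1}\exp\left(-\alpha\sum_{i=0}^{k-1}\frac{1}{f_i}\,(1 + o(1))\right) \geq \sum_{k\geq 1} e^{-\alpha C (1 + o(1))} = +\infty. \qquad (3.11)$$
This holds for every $\alpha > 0$. As a consequence, the Laplace transform $\mathcal{L}(\mathbb{E}V(d\cdot))(\alpha)$ is always infinite, and there exists no Malthusian parameter. In particular, if we denote $T_\infty = \lim_{k\to\infty} T_k$, then $T_\infty < \infty$ a.s. This means that the birth process $(V_t)_{t\geq 0}$ explodes in finite time. When the weights are at most linear, the bound in (3.11) does not hold anymore. In fact, consider as an example affine weights $f_k = ak + b$. We have that $\sum_{i=0}^{k-1} 1/f_i = (1/a)\log k\,(1 + o(1))$. As a consequence, the Laplace transform can be written as
$$\sum_{k\geq 1}\exp\left(-\frac{\alpha}{a}\log k\,(1 + o(1))\right) = \sum_{k\geq 1} k^{-\frac{\alpha}{a}(1 + o(1))}. \qquad (3.12)$$
In this case, the Laplace transform is finite for $\alpha > a$. In the sublinear case, the partial sums $\sum_{i=0}^{k-1} 1/f_i$ grow polynomially in $k$; for instance, for $f_k = ak^q + b$ with $q \in (0,1)$ they behave as $Ck^{1-q}(1 + o(1))$ for a constant $C$, and the Laplace transform becomes
$$\sum_{k\geq 1}\exp\left(-\alpha C k^{1-q}\,(1 + o(1))\right).$$
This sum is finite for any $\alpha > 0$.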
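The role of the Laplace transform in locating the Malthusian parameter can be sketched numerically for affine weights. The code below truncates the series $\sum_{k\geq 1}\prod_{i<k} f_i/(\alpha + f_i)$ and solves for the value of $\alpha$ at which it equals 1 by bisection; the truncation level, the bracketing interval and the function names are choices made for this illustration only, and the truncated series slightly underestimates the true transform.

```python
def laplace_transform(alpha, a, b, n_terms=20000):
    """Truncated series sum_{k>=1} prod_{i<k} f_i/(alpha + f_i), f_i = a*i + b."""
    total, prod = 0.0, 1.0
    for k in range(1, n_terms + 1):
        f_prev = a * (k - 1) + b
        prod *= f_prev / (alpha + f_prev)  # prod now equals E[exp(-alpha * T_k)]
        total += prod
    return total


def malthusian_parameter(a, b, n_iter=40):
    """Solve laplace_transform(alpha) = 1 by bisection on (a, a + 2b)."""
    lo = 1.001 * a    # just above alpha = a, where the full series diverges
    hi = a + 2.0 * b  # here the full series equals b / (hi - a) = 1/2 < 1
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if laplace_transform(mid, a, b) > 1.0:
            lo = mid  # transform still too large: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(malthusian_parameter(1.0, 1.0))  # close to a + b = 2
```

For $a = b = 1$ the returned value is close to $a + b = 2$, in agreement with the affine case discussed in Section 3.2.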
We can now introduce fitness in the stationary process: Remark 3.17. Consider the process (V t ) t≥0 defined by the sequence of PA weights (f k ) k∈N as in Section 3.2. For u ∈ R + we denote by (V u t ) t≥0 the process defined by the sequence (uf k ) k∈N . It is easy to show that L(Eξ V u (d·))(α) = L(Eξ V (d·))(α/u).
The behavior of the degree sequence of $(V^u_t)_{t\geq 0}$ is the same as that of the process $(V_t)_{t\geq 0}$.
Remark 3.17 shows a sort of monotonicity of the Laplace transform with respect to the sequence $(f_k)_{k\in\mathbb{N}}$. This is very useful to describe the Laplace transform of a birth process with fitness, which we define now:

Definition 3.18 (Stationary fitness birth processes). Consider a birth process $(V_t)_{t\geq 0}$ defined by a sequence of weights $(f_k)_{k\in\mathbb{N}}$. Let $Y$ be a positive random variable. We call stationary fitness birth process the process $(V^Y_t)_{t\geq 0}$ defined by the random sequence of weights $(Y f_k)_{k\in\mathbb{N}}$, i.e., conditionally on $Y$,
$$\mathbb{P}\left(V^Y_{t+h} = k + 1 \mid V^Y_t = k,\, Y\right) = Y f_k h + o(h).$$

By Definition 3.18, it is obvious that the properties of the process $(V^Y_t)_{t\geq 0}$ are related to the properties of $(V_t)_{t\geq 0}$. Since we consider a random fitness $Y$ independent of the process $(V_t)_{t\geq 0}$, from Remark 3.17 it follows that
$$\mathcal{L}(\mathbb{E}\xi_{V^Y}(d\cdot))(\alpha) = \mathbb{E}\left[\mathcal{L}(\mathbb{E}\xi_V(d\cdot))(\alpha/Y)\right] = \mathbb{E}\left[\sum_{k\geq 1}\prod_{i=0}^{k-1}\frac{f_i}{\alpha/Y + f_i}\right]. \qquad (3.13)$$
For affine weights the fitness distribution needs to be bounded, as discussed in Section 2.4. In this section we give a qualitative explanation of this fact. Consider the sum in the expectation on the right-hand side of (3.13). We can rewrite the sum as
$$\sum_{k\geq 1}\prod_{i=0}^{k-1}\frac{f_i}{\alpha/Y + f_i} = \sum_{k\geq 1}\exp\left(-\frac{\alpha}{Y}\sum_{i=0}^{k-1}\frac{1}{f_i}\,(1 + o(1))\right). \qquad (3.14)$$
The behavior depends sensitively on the asymptotic behavior of the PA weights. In particular, a necessary condition for the existence of the Malthusian parameter is that the sum in (3.13) is finite on an interval of the type $(\tilde\alpha, +\infty)$. In other words, since the Laplace transform is a decreasing function (when finite), we need to prove the existence of a minimum value $\tilde\alpha$ such that it is finite for every $\alpha > \tilde\alpha$. Using (3.14) in (3.13), we then just need to find a value $\alpha$ such that the right-hand side of (3.13) equals 1.
In the case of affine weights, $\sum_{i=0}^{k-1} 1/f_i = C\log k\,(1 + o(1))$ for a constant $C$. As a consequence, (3.14) is equal to
$$\sum_{k\geq 1} k^{-\frac{C\alpha}{Y}(1 + o(1))}.$$
This sum is finite only on the event $\{Y < C\alpha\}$. If $Y$ has an unbounded distribution, then for every value of $\alpha > 0$ we have that $\{Y \geq C\alpha\}$ is an event of positive probability. As a consequence, for every $\alpha > 0$, the Laplace transform of the birth process $(V^Y_t)_{t\geq 0}$ is infinite, which means there exists no Malthusian parameter. This is why a bounded fitness distribution is necessary to have a Malthusian parameter using affine PA weights. The situation is different in the case of sublinear weights. For example, consider $f_k = (1 + k)^q$, where $q \in (0,1)$. Then, the difference from affine weights is that now $\sum_{i=0}^{k-1} 1/f_i = Ck^{1-q}(1 + o(1))$. Using this in (3.14), we obtain
$$\sum_{k\geq 1}\exp\left(-\frac{\alpha}{Y}\,Ck^{1-q}\,(1 + o(1))\right).$$
In this case, since both $\alpha$ and $Y$ are always positive, the last sum is finite with probability 1, and its expectation might be finite under appropriate moment assumptions on $Y$.

Assume now that the fitness $Y$ satisfies the necessary conditions, so that the process $(V^Y_t)_{t\geq 0}$ is supercritical and Malthusian with parameter $\alpha^*$. We can evaluate the limiting degree distribution. Conditioning on $Y$, the Laplace transform of $\mathbb{P}(V^Y_t = k \mid Y)$ satisfies
$$\alpha^*\,\mathcal{L}\left(\mathbb{P}(V^Y_{\cdot} = k \mid Y)\right)(\alpha^*) = \frac{\alpha^*}{\alpha^* + Y f_k}\prod_{i=0}^{k-1}\frac{Y f_i}{\alpha^* + Y f_i}, \qquad (3.15)$$
so, as a consequence, the limiting degree distribution of the branching process is
$$p_k = \mathbb{E}\left[\frac{\alpha^*}{\alpha^* + Y f_k}\prod_{i=0}^{k-1}\frac{Y f_i}{\alpha^* + Y f_i}\right]. \qquad (3.16)$$
The right-hand side of (3.16) is similar to the distribution in the simpler case with no fitness given by (2.2). We still have a product structure for the limit distribution, but in the fitness case it has to be averaged over the fitness distribution. This result is similar to [10, Theorem 2.7, Corollary 2.8].
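Considering affine weights $f_k = ak + b$, we can rewrite (3.16) as
$$p_k = \mathbb{E}\left[\frac{\alpha^*}{aY}\,\frac{\Gamma\left(\frac{b}{a} + \frac{\alpha^*}{aY}\right)}{\Gamma\left(\frac{b}{a}\right)}\,\frac{\Gamma\left(k + \frac{b}{a}\right)}{\Gamma\left(k + \frac{b}{a} + 1 + \frac{\alpha^*}{aY}\right)}\right].$$
Asymptotically in $k$, the argument of the expectation in the previous expression is a random power law with exponent $\tau(Y) = 1 + \alpha^*/(aY)$. For example, in this case, averaging over the fitness distribution, it is possible to obtain power laws with logarithmic corrections (see, e.g., [5, Corollary 32]).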

Existence of limiting distributions
In this section, we give the proof of Theorems 2.2, 2.3 and 2.5, proving that the branching processes defined in Section 2 do have a limiting degree distribution. As mentioned, we start by proving Theorem 2.5, and then explain how Theorem 2.2 follows as a special case. Before proving the result, we need some remarks on the processes we consider. Birth processes with aging alone and with aging and fitness are defined in Definitions 2.1 and 2.4, respectively. Consider then a process with aging and fitness $(M_t)_{t\geq 0}$ as in Definition 2.4. Let $(\bar{T}_k)_{k\in\mathbb{N}}$ denote the sequence of birth times, i.e., $\bar{T}_k = \inf\{t \geq 0 : M_t \geq k\}$. It is an immediate consequence of the definition that, for every $k \in \mathbb{N}$ and $t \geq 0$,
$$\mathbb{P}\left(\bar{T}_k \leq t\right) = \mathbb{P}\left(T_k \leq Y G(t)\right),$$
where $(T_k)_{k\in\mathbb{N}}$ is the sequence of birth times of a stationary birth process $(V_t)_{t\geq 0}$ defined by the same PA function $f$. Consider then the sequence of functions $(P_k[V](t))_{k\in\mathbb{N}}$ associated with the stationary process $(V_t)_{t\geq 0}$ defined by the same sequence of weights $(f_k)_{k\in\mathbb{N}}$ (see Proposition 3.14). As a consequence, for every $k \in \mathbb{N}$, $\mathbb{P}(M_t = k) = \mathbb{E}[P_k[V](Y G(t))]$, and the same holds for an aging process by taking $Y \equiv 1$. Formula (4.1) implies that the aging process is the stationary process with a deterministic time-change given by $G(t)$. A process with aging and fitness is the stationary process with a random time-change given by $Y G(t)$.
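The time-change description lends itself to a direct simulation. The sketch below samples the lifetime number of children $V_{G(\infty)}$ of an individual with aging alone and affine weights, by running the stationary process until stationary time $G(\infty)$, and maps stationary jump times to aging jump times through $G^{-1}$. The aging function $g(t) = e^{-\lambda t}$ (so $G(t) = (1 - e^{-\lambda t})/\lambda$), the function names and the parameters are assumptions of this illustration; the check uses the affine mean $\mathbb{E}[V_t] = (b/a)(e^{at} - 1)$.

```python
import math
import random


def lifetime_children(a, b, G_inf, rng):
    """Sample V_{G(inf)} for affine weights f_k = a*k + b (aging, no fitness).

    The stationary process V jumps from k to k+1 after an Exp(f_k) waiting
    time; under the time change t -> G(t), only the jumps that occur before
    stationary time G_inf are ever realised by the aging process.
    """
    t, k = 0.0, 0
    while True:
        t += rng.expovariate(a * k + b)  # waiting time T_{k+1} - T_k
        if t > G_inf:
            return k
        k += 1


def aging_jump_time(T_k, lam=1.0):
    """Map a stationary jump time T_k < G(inf) to the aging time G^{-1}(T_k).

    Assumes the hypothetical aging function g(t) = exp(-lam * t), for which
    G(t) = (1 - exp(-lam * t)) / lam and G(inf) = 1 / lam.
    """
    return -math.log(1.0 - lam * T_k) / lam


if __name__ == "__main__":
    rng = random.Random(0)
    n = 20000
    mean = sum(lifetime_children(1.0, 1.0, 1.0, rng) for _ in range(n)) / n
    print(mean, math.e - 1.0)  # Monte Carlo mean vs (b/a) * (exp(a*G_inf) - 1)
```

A fitness $Y$ could be included in the same way by replacing `G_inf` with `Y * G_inf` for an independently sampled `Y`.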
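Assume now that $g$ is integrable, i.e., $\lim_{t\to\infty} G(t) = G(\infty) < \infty$. Using (4.1) we can describe the limiting degree distribution $(q_k)_{k\in\mathbb{N}}$ of a fixed individual in the branching population, i.e., the distribution of the total number $N_\infty$ (or $M_\infty$) of children an individual will generate in its entire lifetime. In fact, for every $k \in \mathbb{N}$,
$$q_k = \lim_{t\to\infty}\mathbb{P}(N_t = k) = P_k[V](G(\infty)),$$
which means that $N_\infty$ has the same distribution as $V_{G(\infty)}$. With fitness,
$$q_k = \lim_{t\to\infty}\mathbb{P}(M_t = k) = \mathbb{E}\left[P_k[V](Y G(\infty))\right]. \qquad (4.2)$$

Lemma 4.1 (Condition (2.6)). Consider a stationary process $(V_t)_{t\geq 0}$, an integrable aging function $g$ and a random fitness $Y$. Assume that $\mathbb{E}[V_t] < \infty$ for every $t \geq 0$. Then the process $(V_{YG(t)})_{t\geq 0}$ is supercritical if and only if Condition (2.6) holds, i.e.,
$$\mathbb{E}\left[V_{YG(t)}\right] < \infty \ \text{ for every } t \geq 0 \qquad\text{and}\qquad \lim_{t\to\infty}\mathbb{E}\left[V_{YG(t)}\right] > 1.$$

Proof. For the if part, we need to prove that the Laplace transform of the process takes the value 1 at some $\alpha^* > 0$. As before, $(\bar{T}_k)_{k\in\mathbb{N}}$ are the jump times of the process $(V_{YG(t)})_{t\geq 0}$. Then
$$\sum_{k\geq 1}\mathbb{E}\left[e^{-\alpha \bar{T}_k}\right] = \mathbb{E}\left[V_{YG(T_\alpha)}\right] = \int_0^\infty \alpha e^{-\alpha t}\,\mathbb{E}\left[V_{YG(t)}\right]dt = \int_0^\infty e^{-u}\,\mathbb{E}\left[V_{YG(u/\alpha)}\right]du.$$
When $\alpha \to \infty$ we have $\mathbb{E}[V_{YG(u/\alpha)}] \to 0$. Then, fix $\alpha_0 > 0$ such that $\mathbb{E}[V_{YG(u/\alpha)}] < 1$ for every $\alpha > \alpha_0$. As a consequence, $e^{-u}\,\mathbb{E}[V_{YG(u/\alpha)}] \leq e^{-u}$ for any $\alpha > \alpha_0$. By dominated convergence,
$$\lim_{\alpha\to\infty}\int_0^\infty e^{-u}\,\mathbb{E}\left[V_{YG(u/\alpha)}\right]du = 0.$$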

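Now suppose Condition (2.6) does not hold. This means that either $\mathbb{E}[V_{YG(t)}] = +\infty$ for some $t \geq 0$, or $\lim_{t\to\infty}\mathbb{E}[V_{YG(t)}] \leq 1$.
If the first condition holds, then there exists $t_0$ such that $\mathbb{E}[V_{YG(t)}] = +\infty$ for every $t \geq t_0$ (recall that $\mathbb{E}[V_{YG(t)}]$ is an increasing function of $t$). As a consequence, for every $\alpha > 0$, we have $\mathbb{E}[V_{YG(T_\alpha)}] = +\infty$, which means that the process is explosive.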
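If the second condition holds, then for every $\alpha > 0$ the Laplace transform of the process is strictly less than 1, which means there exists no Malthusian parameter.

Lemma 4.1 gives a weaker condition on the distribution of $Y$ than requiring it to be bounded. Now, we want to investigate the degree distribution of the branching process, assuming that the process $(M_t)_{t\geq 0}$ is supercritical and Malthusian. Denote the Malthusian parameter by $\alpha^*$. The above allows us to complete the proof of Theorem 2.5:

Proof of Theorem 2.5. We start from
$$p_k = \alpha^*\int_0^\infty e^{-\alpha^* t}\,\mathbb{P}(M_t = k)\,dt = \mathbb{E}\left[\alpha^*\int_0^\infty e^{-\alpha^* t}\,P_k[V](Y G(t))\,dt\right]. \qquad (4.3)$$
Conditioning on $Y = u$ and integrating by parts in the integral given by the expectation in (4.3) gives, for $k \geq 1$,
$$\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k)\right)(\alpha^*) = \frac{1}{\alpha^*}\Big[-u f_k\,\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k)\,g(\cdot)\right)(\alpha^*) + u f_{k-1}\,\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k-1)\,g(\cdot)\right)(\alpha^*)\Big].$$
Now, we define
$$\hat{L}(k,\alpha^*,u) = \frac{\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k)\,g(\cdot)\right)(\alpha^*)}{\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k)\right)(\alpha^*)}, \qquad (4.4)$$
and we write $\hat{L}(k,\alpha^*,Y)$ for its evaluation at $u = Y$. Notice that the sequence $(\hat{L}(k,\alpha^*,Y))_{k\in\mathbb{N}}$ is a sequence of random variables. Multiplying both sides of the equation by $\alpha^*$, on the right-hand side we have
$$-u f_k\,\hat{L}(k,\alpha^*,u)\,\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k)\right)(\alpha^*) + u f_{k-1}\,\hat{L}(k-1,\alpha^*,u)\,\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k-1)\right)(\alpha^*),$$
while on the left-hand side we have $\alpha^*\,\mathcal{L}(\mathbb{P}(V_{uG(\cdot)} = k))(\alpha^*)$. As a consequence,
$$\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k)\right)(\alpha^*) = \frac{u f_{k-1}\,\hat{L}(k-1,\alpha^*,u)}{\alpha^* + u f_k\,\hat{L}(k,\alpha^*,u)}\,\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k-1)\right)(\alpha^*). \qquad (4.5)$$
We start from $p_0$, which, conditionally on $Y = u$, is given by
$$\alpha^*\,\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = 0)\right)(\alpha^*) = \frac{\alpha^*}{\alpha^* + u f_0\,\hat{L}(0,\alpha^*,u)}.$$
Recursively using (4.5) gives
$$\alpha^*\,\mathcal{L}\left(\mathbb{P}(V_{uG(\cdot)} = k)\right)(\alpha^*) = \frac{\alpha^*}{\alpha^* + u f_k\,\hat{L}(k,\alpha^*,u)}\prod_{i=0}^{k-1}\frac{u f_i\,\hat{L}(i,\alpha^*,u)}{\alpha^* + u f_i\,\hat{L}(i,\alpha^*,u)}.$$
Taking expectation on both sides with $u = Y$ gives
$$p_k = \mathbb{E}\left[\frac{\alpha^*}{\alpha^* + f_k Y \hat{L}(k,\alpha^*,Y)}\prod_{i=0}^{k-1}\frac{f_i Y \hat{L}(i,\alpha^*,Y)}{\alpha^* + f_i Y \hat{L}(i,\alpha^*,Y)}\right].$$
Now the sequence $(\hat{L}(k,\alpha^*,Y))_{k\in\mathbb{N}}$ creates a relation among the sequence of weights, the aging function and the fitness distribution, so that these three ingredients are deeply related.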
4.2 Proof of Theorems 2.2 and 2.3. As mentioned, Theorem 2.2 follows immediately by considering $Y \equiv 1$. The proof is in fact the same, since we can express the probabilities $\mathbb{P}(N_t = k)$ as a function of the stationary process $(V_t)_{t\geq 0}$ defined by the same PA function $f$. Condition (2.5) immediately follows from Condition (2.6). In fact, considering $Y \equiv 1$, Condition (2.6) becomes
$$\mathbb{E}\left[V_{G(t)}\right] < \infty \ \text{ for every } t \geq 0 \qquad\text{and}\qquad \lim_{t\to\infty}\mathbb{E}\left[V_{G(t)}\right] > 1.$$
The first inequality is in general true for the type of stationary process we consider (for instance with $f$ affine). The second inequality is exactly Condition (2.5).
The expression of the sequence $(\hat{L}_g(k,\alpha^*))_{k\in\mathbb{N}}$ is simpler than in the general case given in (4.4). In fact, in (4.4), the sequence $(\hat{L}(k,\alpha^*,Y))_{k\in\mathbb{N}}$ is actually a sequence of random variables. In the case of aging alone,
$$\hat{L}_g(k,\alpha^*) = \frac{\mathcal{L}\left(\mathbb{P}(V_{G(\cdot)} = k)\,g(\cdot)\right)(\alpha^*)}{\mathcal{L}\left(\mathbb{P}(V_{G(\cdot)} = k)\right)(\alpha^*)},$$
which is a deterministic sequence.
Remark 4.2. Notice that $\hat{L}_g(k,\alpha^*) = 1$ when $g(t) \equiv 1$, so that $G(t) = t$ for every $t \in \mathbb{R}^+$ and there is no aging, and we retrieve the stationary process $(V_t)_{t\geq 0}$.
Unfortunately, the explicit expression of the coefficients $(\hat{L}_g(k,\alpha^*))_{k\in\mathbb{N}}$ is not easy to find, even though they are deterministic. Theorem 2.3, which states that even if $g$ is integrable, the aging does not affect the explosive behavior of a birth process with superlinear weights, is a direct consequence of (4.2):

Proof of Theorem 2.3. Consider a birth process $(V_t)_{t\geq 0}$, defined by a sequence of superlinear weights $(f_k)_{k\in\mathbb{N}}$ (in the sense of Definition 3.16), and an integrable aging function $g$. Then, for every $t > 0$,
$$\mathbb{P}(N_t = \infty) = \mathbb{P}\left(V_{G(t)} = \infty\right) = \mathbb{P}\left(T_\infty \leq G(t)\right) > 0.$$
Since this holds for every $t > 0$, the process $(N_t)_{t\geq 0}$ is explosive. As a consequence, for any $\alpha > 0$, $\mathbb{E}[N_{T_\alpha}] = \infty$, which means that there exists no Malthusian parameter.

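Affine weights and adapted Laplace method
5.1 Aging and no fitness. In this section, we consider affine PA weights, i.e., we consider $f_k = ak + b$. The main aim is to identify the asymptotic behavior of the limiting degree distribution of the branching process with aging. Consider a stationary process $(V_t)_{t\geq 0}$, where $f_k = ak + b$. Then, for any $t \geq 0$, it is possible to show by induction and the recursions in (3.8) and (3.9) that
$$P_k[V](t) = \frac{\Gamma\left(k + \frac{b}{a}\right)}{\Gamma\left(\frac{b}{a}\right)\Gamma(k+1)}\,e^{-bt}\left(1 - e^{-at}\right)^k. \qquad (5.1)$$
We omit the proof of (5.1). As a consequence, since the corresponding aging process is $(V_{G(t)})_{t\geq 0}$, the limiting degree distribution is given by
$$p_k = \alpha^*\,\frac{\Gamma\left(k + \frac{b}{a}\right)}{\Gamma\left(\frac{b}{a}\right)\Gamma(k+1)}\int_0^\infty e^{-\alpha^* t - bG(t)}\left(1 - e^{-aG(t)}\right)^k dt. \qquad (5.2)$$
We can obtain an immediate upper bound for $p_k$; in fact,
$$p_k \leq \frac{\Gamma\left(k + \frac{b}{a}\right)}{\Gamma\left(\frac{b}{a}\right)\Gamma(k+1)}\left(1 - e^{-aG(\infty)}\right)^k,$$
which implies that the distribution $(p_k)_{k\in\mathbb{N}}$ has at most an exponential tail. A more precise analysis is hard. Instead we will give an asymptotic approximation, by adapting the Laplace method for integrals to our case. The Laplace method states that, for a function $\Psi$ that is twice differentiable and has a unique absolute minimum $x_0 \in (a,b)$, as $k \to \infty$,
$$\int_a^b e^{-k\Psi(x)}\,dx = \sqrt{\frac{2\pi}{k\Psi''(x_0)}}\;e^{-k\Psi(x_0)}\,(1 + o(1)). \qquad (5.3)$$
In this situation, the interval $[a,b]$ can be infinite. The idea behind this result is that, when $k \gg 1$, the major contribution to the integral comes from a neighborhood of $x_0$, where $e^{-k\Psi(x)}$ is maximized. In the integral in (5.2), we do not have this situation, since we do not have an integral of the type (5.3). Defining
$$\Psi_k(t) = \frac{\alpha^*}{k}\,t + \frac{b}{k}\,G(t) - \log\left(1 - e^{-aG(t)}\right),$$
we can rewrite the integral in (5.2) as
$$\int_0^\infty e^{-k\Psi_k(t)}\,dt.$$
The derivative of the function $\Psi_k(t)$ is
$$\Psi_k'(t) = \frac{\alpha^*}{k} + \frac{b}{k}\,g(t) - \frac{a\,g(t)}{e^{aG(t)} - 1}. \qquad (5.6)$$
In particular, if there exists a minimum $t_k$, then it depends on $k$. In this framework, we cannot directly apply the Laplace method. We now show that we can apply a result similar to (5.3) even to our case:

Lemma 5.1 (Adapted Laplace method 1). Consider $\alpha, a, b > 0$.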
Let the integrable aging function $g$ be such that (1) for every $t \geq 0$, $0 < g(t) \leq A < \infty$; (2) $g$ is differentiable on $\mathbb{R}_+$, and $g'$ is finite almost everywhere; (3) there exists a positive constant $B < \infty$ such that $g(t)$ is decreasing for $t \geq B$; (4) the solution $t_k$ of $\Psi_k'(t) = 0$, for $\Psi_k'(t)$ as in (5.6), is unique, and $g'(t_k) < 0$.
Then, for $\sigma_k^2 = (k\Psi_k''(t_k))^{-1}$, there exists a constant $C$ such that, as $k \to \infty$, where $N(0, \sigma_k^2)$ denotes a normal distribution with zero mean and variance $\sigma_k^2$.
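To make the classical statement behind (5.3) concrete, the following minimal sketch compares a numerically computed integral of $e^{-k\Psi(x)}$ with the Laplace approximation $\sqrt{2\pi/(k\Psi''(x_0))}\,e^{-k\Psi(x_0)}$. The test function $\Psi(x) = \cosh(x) - 1$, the interval $[-5, 5]$ and all function names are assumptions chosen for illustration; they are not taken from the paper, and the sketch covers only the classical case where $\Psi$ does not depend on $k$.

```python
import math

def laplace_approx(psi_min, psi_dd_min, k):
    """Classical Laplace value sqrt(2*pi/(k*Psi''(x0))) * exp(-k*Psi(x0))."""
    return math.sqrt(2.0 * math.pi / (k * psi_dd_min)) * math.exp(-k * psi_min)

def integrate(f, a, b, n=20_000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def psi(x):
    # Hypothetical test function: unique minimum at x0 = 0, Psi(0) = 0, Psi''(0) = 1.
    return math.cosh(x) - 1.0

def laplace_ratio(k):
    """Ratio of the numerical integral to its Laplace approximation."""
    numeric = integrate(lambda x: math.exp(-k * psi(x)), -5.0, 5.0)
    return numeric / laplace_approx(0.0, 1.0, k)

ratios = {k: laplace_ratio(k) for k in (10, 100, 1000)}
for k, r in ratios.items():
    print(k, r)
```

The ratio approaches 1 as $k$ grows, which is the content of the $(1 + o(1))$ factor. In Lemma 5.1 the additional difficulty is that the minimizer $t_k$ and the curvature both move with $k$, which this sketch does not attempt to reproduce.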
Since Lemma 5.1 is an adapted version of the classical Laplace method, we move the proof to Appendix A.1. We can use the result of Lemma 5.1 to prove:
Proposition 5.2 (Asymptotics - affine weights, aging, no fitness). Consider the affine PA weights $f_k = ak + b$, an integrable aging function $g$, and denote the limiting degree distribution of the corresponding branching process by $(p_k)_{k\in\mathbb{N}}$. Then, under the hypotheses of Lemma 5.1, there exists a constant $C > 0$ such that, as $k \to \infty$, where D k (g) = 1 2

5.2 Aging and fitness case. In this section, we investigate the asymptotic behavior of the limiting degree distribution of a CTBP, in the case of affine PA weights. The method we use is analogous to that in Section 5.1. We assume that the fitness $Y$ is absolutely continuous with respect to the Lebesgue measure, and we denote its density function by $\mu$. The limiting degree distribution of this type of branching process is given by We immediately see that the degree distribution has exponential tails when the fitness distribution is bounded:
Lemma 5.3 (Exponential tails for integrable aging and bounded fitnesses). When there exists $\gamma$ such that $\mu([0, \gamma]) = 1$, i.e., the fitness has a bounded support, then In particular, $p_k$ has exponential tails.
As in the situation with only aging, the explicit solution of the integral in (5.8) may be hard to find. We again have to adapt the Laplace method to estimate the asymptotic behavior of the integral. We write As before, we want to minimize the function $\Psi_k$. We state here the lemma:
Lemma 5.4 (Adapted Laplace method 2). Let $\Psi_k(t, s)$ be as in (5.11). Assume that (1) $g$ satisfies the assumptions of Lemma 5.1; (2) $\mu$ is twice differentiable on $\mathbb{R}_+$; (3) there exists a constant $B > 0$ such that, for every $s \geq B$, $\mu$ is monotonically decreasing; (4) $(t_k, s_k)$ is the unique point where both partial derivatives are zero; (5) $(t_k, s_k)$ is the absolute minimum for $\Psi_k(t, s)$; (6) the Hessian matrix $H_k(t_k, s_k)$ of $\Psi_k(t, s)$ evaluated at $(t_k, s_k)$ is positive definite. Then, where $(N_1(k), N_2(k)) := N(\mathbf{0}, (kH_k(t_k, s_k))^{-1})$ is a bivariate normally distributed vector and $\mathbf{0} = (0, 0)$.
The proof of Lemma 5.4 can be found in Appendix B.1. Using Lemma 5.4 we can describe the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$:
Proposition 5.5 (Asymptotics - affine weights, aging, fitness). Consider affine PA weights $f_k = ak + b$, an integrable aging function $g$ and a fitness distribution density $\mu$. Assume that the corresponding branching process is supercritical and Malthusian. Under the hypotheses of Lemma 5.4, the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$ of the corresponding CTBP satisfies
5.3 Three classes of fitness distributions. Proposition 5.5 in Section 5.2 gives the asymptotic behavior of the limiting degree distribution of a CTBP with integrable aging and fitness. Lemma 5.4 requires conditions under which the function $\Psi_k(t, s)$ as in (5.11) has a unique minimum point denoted by $(t_k, s_k)$. In this section we consider the three different classes of fitness distributions that we have introduced in Section 2.4. For the heavy-tailed class, i.e., for distributions with tail thicker than exponential, there is nothing to prove. In fact, (2.7) immediately implies that the corresponding processes are explosive.
For the other two cases, we apply Proposition 5.5, giving the precise asymptotic behavior of the limiting degree distributions of the corresponding CTBPs. Propositions 5.6 and 5.7 contain the results on the general-exponential and sub-exponential classes, respectively. The proofs of these propositions are moved to Appendix C.
Proposition 5.6. Consider a general exponential fitness distribution as in (2.11). Let $(M_t)_{t\geq 0}$ be the corresponding birth process. Denote the unique minimum point of $\Psi_k(t, s)$ as in (5.11) by $(t_k, s_k)$. Then (1) for every $t \geq 0$, $M_t$ has a dynamical power law with exponent $\tau(t) = 1 + \theta/(aG(t))$; (2) the asymptotic behavior of the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$ is given by where the power law term has exponent $\tau = 1 + \theta/(aG(\infty))$; (3) the distribution $(q_k)_{k\in\mathbb{N}}$ of the total number of children of a fixed individual has a power law behavior with exponent $\tau = 1 + \theta/(aG(\infty))$.
By (2.8) it is necessary to consider the exponential rate $\theta > aG(\infty)$ to obtain a non-explosive process. In particular, this implies that, for every $t \geq 0$, $\tau(t)$, as well as $\tau$, are strictly larger than 2. As a consequence, the three distributions $(P_k[M](t))_{k\in\mathbb{N}}$, $(p_k)_{k\in\mathbb{N}}$ and $(q_k)_{k\in\mathbb{N}}$ have finite first moment. Choosing $\theta > 2aG(\infty)$ leads to power-law distributions with exponent larger than 3, and hence with finite variance.
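The arithmetic relating the rate $\theta$ to the exponent and to the moments can be sketched as follows. The helper names and the numerical values of $\theta$, $a$ and $G(\infty)$ are hypothetical and serve only as an illustration; the sketch uses the standard fact that a tail of the form $k^{-\tau}$ has finite mean when $\tau > 2$ and finite variance when $\tau > 3$.

```python
def tail_exponent(theta, a, G):
    """Exponent 1 + theta/(a*G) of the power-law tail; G stands for G(t) or G(infinity)."""
    if min(theta, a, G) <= 0:
        raise ValueError("theta, a and G must be positive")
    return 1.0 + theta / (a * G)

def finite_moments(theta, a, G_inf):
    """Return (finite_mean, finite_variance) for a tail k^(-tau), tau = 1 + theta/(a*G_inf)."""
    tau = tail_exponent(theta, a, G_inf)
    return tau > 2.0, tau > 3.0

# Hypothetical parameters: a = 1 and G(infinity) = 0.8, so the thresholds are 0.8 and 1.6.
for theta in (1.0, 2.0):
    print(theta, tail_exponent(theta, 1.0, 0.8), finite_moments(theta, 1.0, 0.8))
```

Since $G(t) \leq G(\infty)$, the exponent $\tau(t)$ at any finite time is at least as large as the limiting exponent $\tau$, so the thresholds for $\tau$ are the binding ones.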
A second observation is that, independently of the aging function $g$, the point $s_k$ is of order $\log k$. This has two consequences. First, the correction to the power law given by $h(s_k)$ is a power of $\log k$, since $h'(s)/h(s) \to 0$ as $s \to \infty$. Second, the power-law term $k^{-(1+\theta/(aG(\infty)))}$ arises from $\mu(s_k)$. This means that the exponential term in the fitness distribution $\mu$ is not only necessary to obtain a non-explosive process, but also generates the power law.
The third observation is that the behavior of the three distributions $(P_k[M](t))_{k\in\mathbb{N}}$, $(p_k)_{k\in\mathbb{N}}$ and $(q_k)_{k\in\mathbb{N}}$ depends on the integrability of the aging function, but depends only marginally on its precise shape. The contribution of the aging function $g$ to the exponent of the power law is in fact given only by the value $G(\infty)$. The other terms that depend directly on the shape of $g$ are $e^{-\alpha^* t_k}$ and the ratio $g'(t_k)/g(t_k)$. The ratio $g'/g$ does not contribute for any function $g$ whose decay is in between power law and exponential. The term $e^{-\alpha^* t_k}$ depends on the behavior of $t_k$, which can be seen as roughly $g^{-1}(1/\log k)$. For any function between power law and exponential, $e^{-\alpha^* t_k}$ is asymptotic to a power of $\log k$.
The last observation is that every distribution in the general exponential class shows a dynamical power law, as does the pure exponential distribution treated in Section 5.4. The pure exponential distribution is the special case $h(s) \equiv 1$. Interestingly, $\tau$ does not depend on the choice of $h(s)$, but only on the exponential rate $\theta > aG(\infty)$. In particular, Proposition 5.6 proves that the limiting degree distributions of the two examples in Figure 9 have power-law decay.
We move to the class of sub-exponential fitness. We show that the power law is lost due to the absence of a pure exponential term. We prove the result using densities of the form for $\varepsilon > 0$ and $C$ the normalization constant. The result is the following:
Proposition 5.7. Consider a sub-exponential fitness distribution as in (5.12). Let $(M_t)_{t\geq 0}$ be the corresponding birth process. Denote the minimum point of $\Psi_k(t, s)$ as in (5.11) by $(t_k, s_k)$. Then (1) for every $t \geq 0$, $M_t$ satisfies (1 + o(1)); (2) the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$ of the CTBP has asymptotic behavior given by (1 + o(1)); (3) the distribution $(q_k)_{k\in\mathbb{N}}$ of the total number of children of a fixed individual satisfies (1 + o(1)).
In Proposition 5.7 the distributions (P k [M ](t)) k∈N , (p k ) k∈N and (q k ) k∈N decay faster than a power law. This is due to the fact that a sub-exponential tail for the fitness distribution does not allow the presence of sufficiently many individuals in the branching population whose fitness value is sufficiently high to restore the power law.
In this case, we have that $s_k$ is roughly $c_1 \log k - c_2 \log\log k$. Hence, as a first approximation, $s_k$ is still of logarithmic order. The power-law term is lost because there is no pure exponential term in the distribution $\mu$. In fact, in this case $\mu(s_k)$ generates the dominant term $e^{-\theta(\log k)^{1+\varepsilon}}$.

5.4 The case of exponentially distributed fitness: Proof of Corollary 2.6. The case when the fitness $Y$ is exponentially distributed turns out to be simpler. In this section, denote the fitness by $T_\theta$, where $\theta$ is the parameter of the exponential distribution. First of all, we investigate the Laplace transform of the process. In fact, we can write which is the Laplace transform of the stationary process $(V_{sG(T_\alpha)})_{s\geq 0}$ with bounded fitness $G(T_\alpha)$ in $\theta$. As a consequence, Suppose that there exists a Malthusian parameter $\alpha^*$. This means that, for fixed $(f_k)_{k\in\mathbb{N}}$, $g$ and $\theta$, $\alpha^*$ is the unique value such that $E[M_{T_{\alpha^*}}] = 1$. As a consequence, if we fix $(f_k)_{k\in\mathbb{N}}$, $g$ and $\alpha^*$, $\theta$ is the unique value such that Therefore $\theta$ is the Malthusian parameter of the process $(V_{sG(T_\alpha)})_{s\geq 0}$. We are now ready to prove Corollary 2.6:
Proof of Corollary 2.6. We can write $P(M_t = k) = P(V_{T_\theta G(t)} = k)$, which means that we have to evaluate the Laplace transform of $P(V_{sG(t)} = k)$ in $\theta$. Using (5.1) the first part follows immediately by simple calculations. For the second part, we just need to take the limit as $t \to \infty$. For the sequence $(p_k)_{k\in\mathbb{N}}$, the result is immediate since The case of affine PA weights $f_k = ak + b$ is particularly nice. As already mentioned in Section 2, the process $(M_t)_{t\geq 0}$ has a power-law distribution at every $t \in \mathbb{R}_+$ and (2.12) follows immediately. Further, (2.13) and (2.14) follow directly.
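The "simple calculations" for affine weights can be checked numerically. The sketch below assumes that the law of the stationary process in (5.1) is the standard negative binomial law of a pure birth process with rates $f_k = ak + b$ started at 0, that is $P(V_t = k) = \frac{\Gamma(k + b/a)}{\Gamma(b/a)\Gamma(k+1)} e^{-bt}(1 - e^{-at})^k$. Mixing over $s \sim \mathrm{Exp}(\theta)$ with time argument $sG(t)$ and substituting $x = e^{-asG(t)}$ turns the integral into a Beta integral, which suggests the closed form $\frac{\theta}{aG}\frac{\Gamma(b/a + \theta/(aG))}{\Gamma(b/a)}\frac{\Gamma(k + b/a)}{\Gamma(k + 1 + b/a + \theta/(aG))}$ with $G = G(t)$. The function names and parameter values are hypothetical, and the closed form is a derivation under the stated assumption rather than a quotation of (2.12).

```python
import math

def log_nb_pmf(k, t, a, b):
    """log P(V_t = k) for birth rates f_k = a*k + b, V_0 = 0 (assumed form of (5.1)); needs t > 0."""
    r = b / a
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            - b * t + k * math.log1p(-math.exp(-a * t)))

def mixed_pmf_closed(k, theta, a, b, G):
    """Closed form c * Gamma(r + c)/Gamma(r) * Gamma(k + r)/Gamma(k + 1 + r + c), c = theta/(a*G)."""
    r, c = b / a, theta / (a * G)
    return c * math.exp(math.lgamma(r + c) - math.lgamma(r)
                        + math.lgamma(k + r) - math.lgamma(k + 1 + r + c))

def mixed_pmf_numeric(k, theta, a, b, G, s_max=60.0, n=60_000):
    """Midpoint rule for the integral of theta*exp(-theta*s) * P(V_{s*G} = k) over (0, s_max)."""
    h = s_max / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h  # midpoints avoid s = 0, where log(1 - exp(0)) is undefined
        total += theta * math.exp(-theta * s + log_nb_pmf(k, s * G, a, b))
    return total * h

def local_exponent(k, theta, a, b, G):
    """Estimate of the tail exponent from the ratio p(k) / p(2k)."""
    return math.log(mixed_pmf_closed(k, theta, a, b, G)
                    / mixed_pmf_closed(2 * k, theta, a, b, G)) / math.log(2.0)

# Hypothetical parameters with theta > a*G.
theta, a, b, G = 2.0, 1.0, 1.0, 0.8
for k in (0, 1, 5, 50):
    print(k, mixed_pmf_closed(k, theta, a, b, G), mixed_pmf_numeric(k, theta, a, b, G))
print(local_exponent(100_000, theta, a, b, G), 1.0 + theta / (a * G))
```

The ratio of Gamma functions decays like $k^{-(1 + \theta/(aG(t)))}$, which matches the exponent $\tau(t) = 1 + \theta/(aG(t))$ stated in Proposition 5.6 for the pure exponential case.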
A. Limiting distribution with aging effect, no fitness
In this section, we analyze the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$ of CTBPs with aging but no fitness. In Section A.1 we prove the adapted Laplace method for the general asymptotic behavior of $p_k$. In Section A.2 we consider some examples of aging function $g$, giving the asymptotics for the corresponding distributions.
A.1 Proofs of Lemma 5.1 and Proposition 5.2.
Proof of Lemma 5.1. First of all, we show that $t_k$ is actually a minimum. In fact, As a consequence, $t_k$ is a minimum. Then, In particular, $g(t_k)$ is of order $1/k$. Then, since $t_k$ is the actual minimum, and $g$ is monotonically decreasing for $t \geq B$, We use the fact that we are evaluating the second derivative at the point $t_k$ where the first derivative is zero. This means We use this in (A.2) to obtain Now, we use Taylor expansion around $t_k$ of $\Psi_k(t)$ in the integral in (5.5). Since we use the expansion around $t_k$, which is the minimum of $\Psi_k(t)$, the first derivative of $\Psi_k$ is zero. As a consequence, we have First of all, notice that the contribution of the terms with $|t - t_k| \gg 1$ is negligible. In fact, we have which means that such terms are exponentially small, so we can ignore them. Now we make a change of variable $u = t - t_k$. Then In particular, since the term $e^{-k\Psi_k(t_k)}$ does not depend on $u$, we can write We use the notation $k\Psi_k''(t_k) = 1/\sigma_k^2$, which means we can rewrite the integral as Since the distribution $N(0, \sigma_k^2)$ is symmetric with respect to 0, for every $k \in \mathbb{N}$, The behavior of the above integral depends on the ratio $t_k/\sigma_k$, which is nonnegative. As a consequence, the term $P(N(0, \sigma_k^2) \leq t_k)$ is bounded between 1/2 and 1.
Using Lemma 5.1, we can prove Proposition 5.2:
Proof of Proposition 5.2. Recall that $\sigma_k^2 = (k\Psi_k''(t_k))^{-1}$. Using (A.3), the fact that $g'$ is bounded almost everywhere, and $g'(t_k) < 0$, we can write Notice that in (A.6) the terms $g(t_k) - g'(t_k)/g(t_k)$ are always strictly positive, since $g(t)$ is decreasing and $t_k \to \infty$ as $k \to \infty$. As a consequence, we can replace the term $2\pi/\sigma_k^2$ by $Cg(t_k) - g'(t_k)$, for $C = \frac{a(2 - e^{-aG(\infty)})}{1 - e^{-aG(\infty)}}$. We also have that since $G(t_k)$ converges to $G(\infty)$. For the term in (A.5), it is easy to show that it is asymptotic to $D_k(g)$. This completes the proof.
A.2 Examples of aging functions. In this section, we analyze three examples of aging functions, in order to illustrate the limiting degree distribution of the branching process. We consider affine weights $f_k = ak + b$, and three different aging functions: We assume that in every case the aging function $g$ is integrable, so we consider $\lambda > 0$ for the exponential case, $\lambda > 1$ for the power-law case and $\lambda_1, \lambda_2, \lambda_3 > 0$ for the lognormal case. We assume that $g$ satisfies Condition (2.5) in order to have a supercritical process. We now apply (5.7) to these three examples, giving their asymptotics. In general, we approximate $t_k$ with the solution of, for $c_1 = \frac{a e^{-aG(\infty)}}{1 - e^{-aG(\infty)}}$, We start considering the exponential case $g(t) = e^{-\lambda t}$. In this case, from (A.7) we obtain that, ignoring constants, As we expected, $t_k \to \infty$. We now use (A.6), which gives a bound on $\sigma_k^2$ in (A.1) in terms of $g$ and its derivatives. As a consequence, (1)).
Looking at $e^{-k\Psi_k(t_k)}$, it is easy to compute that, with $t_k$ as in (A.8), (1)).
Since $t_k/\sigma_k \to \infty$, then $P(N(0, \sigma_k^2) \leq t_k) \to 1$, so that which means that $p_k$ has an exponential tail with power-law corrections. We now apply the same result to the power-law aging function, so $g(t) = (1 + t)^{-\lambda}$, and (1)).
In conclusion, which means that also in this case we have a power law with exponential truncation. In the case of the lognormal aging function, (A.7) implies that By (A.1) we can say that (1)).

B. Limiting distribution with aging and fitness
In this section, we consider birth processes with aging and fitness. We prove Lemma 5.4,used 1 − e −s k aG(t k ) .
C. Proof of Propositions 5.6 and 5.7
In the present section, we prove Propositions 5.6 and 5.7. These proofs are applications of Proposition 5.5, and mainly consist of computations. In the proof of the two propositions, we often refer to Appendix B.2 for expressions regarding the Hessian matrix of $\Psi_k(t, s)$ as in (5.11).
C.1 Proof of Proposition 5.6. We start by proving the existence of the dynamical power-law. We already know that In order to give asymptotics on $J(k)$ as in (C.2), we can use a Laplace method similar to the one used in the proof of Lemma 5.1, but the analysis is simpler since in this case $\psi_k(s)$ is a function of only one variable. The idea is again to find a minimum point $s_k$ for $\psi_k(s)$, and to use Taylor expansion inside the integral, so We can ignore the contribution of the terms where $(s - s_k)^2 \gg 1$, since $e^{-k\psi_k(s)} \leq e^{-bsG(t)}$, so that the error is at most exponentially small. As a consequence, J(k) = π ψ k (s k ) e −kψ k (s k ) (1 + o(1)). In particular, $s_k$ satisfies the following equality, which is similar to (B.9): In particular, this implies
$s_k = \frac{1}{aG(t)} \log\left(1 + k\,\frac{aG(t)}{bG(t) + \theta}\right)(1 + o(1))$. (C.8)
Similarly to the element $(kH_k(t_k, s_k))_{2,2}$ in (B.8), $\frac{aG(t)}{1 - e^{-as_kG(t)}} + \frac{abG(t)^2}{1 - e^{-as_kG(t)}}$. (C.9)
For the general exponential class, the ratio As a consequence, $k\,\frac{d^2\psi_k(s)}{ds^2}$ converges to a positive constant, which means that $s_k$ is an actual minimum. Then $J(k) = c_1 e^{-k\psi_k(s_k)}(1 + o(1))$. Using this in (C.1) and ignoring the constants, which is a power-law distribution with exponent $\tau(t) = 1 + \theta/(aG(t))$, and minor corrections given by $h(s_k)$. This holds for every $t \geq 0$. In particular, considering $G(\infty)$ instead of $G(t)$, with the same argument we can also prove that the distribution of the total number of children obeys a power-law tail with exponent $\tau(\infty) = 1 + \theta/(aG(\infty))$.
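The logarithmic stationary point in (C.8) can be checked numerically in the pure exponential case $h \equiv 1$. The sketch below assumes that, up to additive constants, $k\psi_k(s) = (\theta + bG(t))s - k\log(1 - e^{-asG(t)})$, which is what one obtains from an exponential density $\theta e^{-\theta s}$ multiplied by a factor $e^{-bsG(t)}(1 - e^{-asG(t)})^k$; this form of $\psi_k$ is an assumption made for illustration, as are the function names and parameter values. Setting the derivative to zero then gives $s_k = \frac{1}{aG(t)}\log\left(1 + k\frac{aG(t)}{bG(t) + \theta}\right)$ without any correction factor.

```python
import math

def k_psi(s, k, theta, a, b, G):
    """Assumed exponent k*psi_k(s) = (theta + b*G)*s - k*log(1 - exp(-a*G*s)), for s > 0."""
    return (theta + b * G) * s - k * math.log1p(-math.exp(-a * G * s))

def s_k_closed(k, theta, a, b, G):
    """Stationary point (1/(a*G)) * log(1 + k*a*G/(b*G + theta))."""
    return math.log1p(k * a * G / (b * G + theta)) / (a * G)

def s_k_numeric(k, theta, a, b, G, lo=1e-9, hi=200.0, iters=200):
    """Ternary search for the minimizer of the convex function s -> k_psi(s)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if k_psi(m1, k, theta, a, b, G) < k_psi(m2, k, theta, a, b, G):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Hypothetical parameters with theta > a*G.
theta, a, b, G = 2.0, 1.0, 1.0, 0.8
for k in (10, 1_000, 100_000):
    print(k, s_k_closed(k, theta, a, b, G), s_k_numeric(k, theta, a, b, G))
```

The minimizer grows like $\log k / (aG(t))$, in line with the observation that $s_k$ is of logarithmic order independently of the aging function.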
We now prove the result on the limiting distribution $(p_k)_{k\in\mathbb{N}}$ of the CTBP, for which we apply directly Proposition 5.5, using the analysis on the Hessian matrix given in Section B.2. First of all, from (B.9) it follows that
$s_k = \frac{1}{aG(t_k)} \log\left(1 + k\,\frac{aG(t_k)}{bG(t_k) + \theta}\right)(1 + o(1))$, (C.11)
and by (B.10) For the Hessian matrix, using (C.11) and (C.12) in (B.8), for any integrable aging function $g$ we have $(kH_k(t_k, s_k))_{2,2} = C_2 + o(1) > 0$, and $(kH_k(t_k, s_k))_{2,1} = o(1)$, but $(kH_k(t_k, s_k))_{1,1}$ behaves according to $g'(t_k)/g(t_k)$. If this ratio is bounded, then $(kH_k(t_k, s_k))_{1,1} = C_1 + o(1) > 0$, while $(kH_k(t_k, s_k))_{1,1} \to \infty$ whenever $g'(t_k)/g(t_k)$ diverges. In both cases, $(t_k, s_k)$ is a minimum. In particular, again ignoring the multiplicative constants and using (C.11) and (C.12) in the definition of $\Psi_k(t, s)$, the limiting degree distribution of the CTBP is asymptotic to
$k^{-(1+\theta/(aG(t_k)))}\, h(s_k)\, e^{-\alpha^* t_k} \left(C - \alpha^* \frac{g'(t_k)}{g(t_k)}\right)^{-1/2}$, (C.13)
where the term $\left(C - \alpha^* \frac{g'(t_k)}{g(t_k)}\right)^{-1/2}$, which comes from the determinant of the Hessian matrix, behaves differently according to the aging function. With this, the proof of Proposition 5.6 is complete.
C.2 Proof of Proposition 5.7. This proof is identical to the proof of Proposition 5.6, but this time we consider a sub-exponential distribution. First, we start looking at the distribution of the birth process at a fixed time $t \geq 0$. We define $\psi_k(s)$ and $J(k)$ as in (C.3) and (C.2). We use again (C.1), so $s_k = \frac{1}{aG(t)} \log\left(1 + k\,\frac{aG(t)}{bG(t) - \mu'(s_k)/\mu(s_k)}\right)$.