1. Introduction

Certain population structures allow selection to act on multiple levels. If a meta-population is subdivided into groups, there can be selection between individuals in a group and selection between groups. Many theoretical and empirical studies of group selection have been performed. Until the 1960s, it was a routine assumption that selection acts not only on the individual, but also on the group level (Wynne-Edwards, 1962). This idea goes back to Charles Darwin (1871), who wrote “There can be no doubt that a tribe including many members who […] were always ready to give aid to each other and to sacrifice themselves for the common good, would be victorious over other tribes; and this would be natural selection.”

Williams (1966) pointed out some problems in this argumentation and subsequently, many biologists dismissed the possibility of group selection. Mathematical models can show the limits of group selection (Maynard Smith, 1964; Wilson, 1975; Levin and Kilmer, 1974; Matessi and Jayakar, 1976). Wilson (1987) has shown that the failure of group selection in the Haystack model of Maynard Smith (1964) hinges on the assumption that groups stay intact until non-cooperators have taken over all mixed groups. Fletcher and Zwick (2004) have shown that in Hamilton’s group selection model (Hamilton, 1975) cooperation can evolve if groups stay intact for several generations. In many other models of group selection, cooperation can evolve (Eshel, 1972; Uyenoyama, 1979; Slatkin, 1981; Leigh, 1983; Wilson, 1983; Killingback et al., 2006; Traulsen and Nowak, 2006). Experiments have shown that artificial group selection can be effective (Wade, 1976; Craig and Muir, 1996; Swenson et al., 2000).

Kerr and Godfrey-Smith (2002) have argued that a group selection perspective can be helpful under many circumstances. Group selection arguments have been invoked for the evolution of the first cell (Szathmáry and Demeter, 1987; Maynard Smith and Szathmáry, 1995) and for optimizing the number of plasmids in bacterial cells (Paulsson, 2002). Group selection might also have played an important role in human evolution (Wilson and Sober, 1998; Boyd and Richerson, 2002; Bowles, 2004; Bowles and Gintis, 2004; Weibull and Salomonsson, 2006; Traulsen and Nowak, 2006; Bowles, 2006). A recent paper by Wilson and Hölldobler (2005) argues that group selection is more important than kin selection for the evolution of social insects; see also Reeve and Hölldobler (2007). Wilson (2007) further questions the importance of kin selection for eusociality.

For some authors, group selection and kin selection are identical concepts (Lehmann et al., 2007). While there could be some overlap between these two mechanisms, we do not consider this to be a useful perspective, in general (Wild and Traulsen, 2007; Taylor and Nowak, 2007). We will return to this topic in the discussion.

Most models of multi-level selection are mathematically very complicated and can only be studied by computer simulation. Here, we consider a simple model that was introduced by Traulsen et al. (2005) and Traulsen and Nowak (2006). We show that this model leads to analytical results for any intensity of selection, if fitness is an exponential function of payoff.

This paper is organized as follows: In Section 2, we recall the frequency dependent Moran process describing a single, well mixed population and discuss the limit of weak selection, where the payoff of the game has only a small effect on fitness. In Section 3, we introduce an exponential mapping of payoffs to fitness and show that this approach leads to exact results for any intensity of selection. In Section 4, we turn to group selection and review the standard results obtained for a linear payoff to fitness mapping. In Section 5, we study group selection using the exponential mapping. In Section 6, we discuss the implications of our results.

2. The Moran process

First, we consider frequency dependent selection in a Moran process which describes a single, well mixed population of n individuals (Nowak et al., 2004; Nowak, 2006a). There are two types of individuals, A and B. Individuals interact with others in pairwise encounters in a well-mixed population. They receive a payoff as defined by the matrix

$$\begin{array}{c|cc} & A & B \\ \hline A & a & b \\ B & c & d \end{array}$$

. The expected payoff \({\pi _A}(j)\) of an A individual in a well-mixed population of j − 1 other A individuals and n − j B individuals is

$${\pi _A}(j) = {{j - 1} \over {n - 1}}a + {{n - j} \over {n - 1}}b$$

. Similarly, the B individuals in such a population have the payoff

$${\pi _B}(j) = {j \over {n - 1}}c + {{n - j - 1} \over {n - 1}}d$$

. As usual, we first assume that fitness, f, is a linear combination of a background fitness (which is set to 1) and the payoff,

$${f_A}(j) = 1 - w + w{\pi _A}(j){\rm{ and }}{f_B}{\rm{(}}j{\rm{) = 1 - }}w{\rm{ + }}w{\pi _B}(j)$$

. The parameter w controls the intensity of selection. For w = 0, there is only neutral drift. For w ≪ 1, we have weak selection. For w = 1, fitness equals payoff and selection is strong. For strong selection, however, \({f_A}(j)\) and \({f_B}(j)\) must be nonnegative for any j ∈ {0, 1, …, n}. Thus, there is an upper limit for w if the payoff matrix has negative entries. This complication arises because the frequency dependent Moran process, in contrast to the replicator dynamics, is not invariant under the addition of a constant to the payoff matrix.

At each time step, one individual is selected at random proportional to fitness and produces an identical offspring, which replaces a randomly chosen individual. The probabilities to change the number of A individuals from j to j ± 1 are given by

$${T^ + }(j) = {{j{f_A}(j)} \over {j{f_A}(j) + (n - j){f_B}(j)}}{{n - j} \over n}$$

,

$${T^ - }(j) = {{(n - j){f_B}(j)} \over {j{f_A}(j) + (n - j){f_B}(j)}}{j \over n}$$

. With probability \(1 - {T^ + }(j) - {T^ - }(j)\), the number of A individuals does not change. The fact that the remaining transition probabilities are zero (j changes at most by one) allows us to calculate the fixation probabilities analytically. In a stochastic process where j can change to any value, as in the frequency dependent Wright-Fisher process (Imhof and Nowak, 2006), the fixation probabilities can only be approximated.
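The update rule above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the original study: the function names `payoffs` and `transition_probs` are hypothetical, and the explicit check for negative fitness simply mirrors the restriction on w discussed earlier.

```python
def payoffs(j, n, a, b, c, d):
    """Expected payoffs (pi_A, pi_B) when j of the n individuals are of type A."""
    pi_a = ((j - 1) * a + (n - j) * b) / (n - 1)
    pi_b = (j * c + (n - j - 1) * d) / (n - 1)
    return pi_a, pi_b


def transition_probs(j, n, a, b, c, d, w):
    """Return (T_plus, T_minus) for the linear payoff to fitness mapping."""
    pi_a, pi_b = payoffs(j, n, a, b, c, d)
    f_a = 1 - w + w * pi_a
    f_b = 1 - w + w * pi_b
    if f_a < 0 or f_b < 0:
        # The linear mapping is only defined for nonnegative fitness.
        raise ValueError("fitness must be nonnegative; reduce w")
    total = j * f_a + (n - j) * f_b
    # An A reproduces (proportional to fitness) and replaces a random B.
    t_plus = j * f_a / total * (n - j) / n
    # A B reproduces (proportional to fitness) and replaces a random A.
    t_minus = (n - j) * f_b / total * j / n
    return t_plus, t_minus
```

For w = 0 both probabilities reduce to j(n − j)/n², the neutral case, and for any admissible w their ratio equals the ratio of the two fitness values.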

For the ratio of the transition probabilities (5) and (6), we have

$${{{T^ - }(j)} \over {{T^ + }(j)}} = {{{f_B}(j)} \over {{f_A}(j)}} = {{1 - w + w{\pi _B}(j)} \over {1 - w + w{\pi _A}(j)}}$$

. This ratio measures at each point in state space how likely it is that the process continues in a certain direction: If the ratio is close to zero, then it is more likely that the number of A individuals increases. If it is very large, the number of A individuals will probably decrease. If it is one, then increase and decrease of the number of A individuals are equally likely.

The fixation probability of a single A individual in a group of n − 1 B individuals is given by (Karlin and Taylor, 1975)

$${\phi _A} = {1 \over {1 + \sum\nolimits_{k = 1}^{n - 1} {\prod\nolimits_{j = 1}^k {{{{T^ - }(j)} \over {{T^ + }(j)}}} } }} = {1 \over {1 + \sum\nolimits_{k = 1}^{n - 1} {\prod\nolimits_{j = 1}^k {{{{f_B}(j)} \over {{f_A}(j)}}} } }}$$

. In contrast to the transition probabilities, fixation probabilities are nonlocal in state space, i.e., all transition probabilities enter into the fixation probabilities.
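Numerically, the sum of products is easiest to evaluate with a running product. The following sketch assumes the linear payoff to fitness mapping introduced above; the function name `fixation_prob_A` is hypothetical.

```python
def fixation_prob_A(n, a, b, c, d, w):
    """Fixation probability of a single A among n - 1 B (linear fitness)."""
    total = 1.0    # k = 0 term of the sum
    product = 1.0  # running product of f_B(j) / f_A(j) for j = 1..k
    for j in range(1, n):
        pi_a = ((j - 1) * a + (n - j) * b) / (n - 1)
        pi_b = (j * c + (n - j - 1) * d) / (n - 1)
        product *= (1 - w + w * pi_b) / (1 - w + w * pi_a)
        total += product
    return 1.0 / total
```

Two sanity checks are useful: for w = 0 the result is 1/n, and for frequency independent fitness with constant ratio r = f_B/f_A the geometric sum gives (1 − r)/(1 − rⁿ).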

The fixation probability of a single B individual in a group of n − 1 A individuals can be calculated as (Nowak, 2006a)

$${\phi _B} = {{\prod\nolimits_{j = 1}^{n - 1} {{{{f_B}(j)} \over {{f_A}(j)}}} } \over {1 + \sum\nolimits_{k = 1}^{n - 1} {\prod\nolimits_{j = 1}^k {{{{f_B}(j)} \over {{f_A}(j)}}} } }}$$

. Because of the sums and products, the fixation probabilities (8) and (9) are difficult to interpret. Taking only linear terms in the intensity of selection w into account, we can derive a weak selection approximation (w≪ 1). We obtain

$${\phi _A} \approx {1 \over n} + {w \over {6n}}\left[ { - 2a - b - c + 4d + n(a + 2b - c - 2d)} \right]$$

,

$${\phi _B} \approx {1 \over n} + {w \over {6n}}\left[ {4a - b - c - 2d + n( - 2a - b + 2c + d)} \right]$$

. These equations allow a direct comparison with a neutral mutant, which has a fixation probability of 1/n. From this comparison, the 1/3-rule is obtained (Nowak et al., 2004; Taylor et al., 2004; Traulsen et al., 2006b; Ohtsuki and Nowak, 2007; Ohtsuki et al., 2007a). This rule is valid for weak selection and large populations. It can be stated as follows: The fixation probability of A is greater than 1/n if the fitness of A is greater than the fitness of B at the point where the frequency of A is 1/3. This rule is valid for any process within the domain of Kingman’s coalescent (Lessard and Ladret, 2007).
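As an illustration, the weak selection approximation for ϕ A can be compared with the exact sum of products. The sketch below is only a numerical check under the linear mapping; the function names and the example payoff values are arbitrary choices, not taken from the original study.

```python
def phi_A_exact(n, a, b, c, d, w):
    """Exact fixation probability of a single A (linear fitness)."""
    total, product = 1.0, 1.0
    for j in range(1, n):
        pi_a = ((j - 1) * a + (n - j) * b) / (n - 1)
        pi_b = (j * c + (n - j - 1) * d) / (n - 1)
        product *= (1 - w + w * pi_b) / (1 - w + w * pi_a)
        total += product
    return 1.0 / total


def phi_A_weak(n, a, b, c, d, w):
    """First-order approximation in w, valid for w << 1."""
    bracket = -2 * a - b - c + 4 * d + n * (a + 2 * b - c - 2 * d)
    return 1 / n + w / (6 * n) * bracket


# Example: a coordination game in a population of n = 10 (illustrative values).
n, w = 10, 1e-5
exact = phi_A_exact(n, 3, 0, 1, 2, w)
approx = phi_A_weak(n, 3, 0, 1, 2, w)
```

For large n the bracket is dominated by n(a + 2b − c − 2d), so ϕ A > 1/n requires a + 2b > c + 2d. This is the statement that A is fitter than B when the frequency of A is 1/3.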

Comparing the fixation probabilities to each other, we have

$${{{\phi _B}} \over {{\phi _A}}} = \prod\limits_{j = 1}^{n - 1} {{{{f_B}(j)} \over {{f_A}(j)}}} \approx 1 - w\left[ {{n \over 2}(a + b - c - d) - a + d} \right]$$

. The approximation is valid for weak selection, w ≪ 1. Note that every single transition probability enters into this ratio. For weak selection, ϕ A > ϕ B is equivalent to

$${n \over 2}(a + b - c - d) - a + d > 0$$

. For large n, this reduces to risk-dominance of A,

$$a + b > c + d$$

. A risk dominant strategy can be defined as the Nash equilibrium with the larger basin of attraction. If the amount of noise in the system increases, it is more likely that the system is found in the risk dominant equilibrium. For larger w, the relation between risk dominance and the fixation probabilities can be more complicated (Nowak et al., 2004; Fudenberg et al., 2006).

Here, we restrict the discussion to games where coexistence of two strategies, as in the snowdrift game or hawk-dove game (Hauert and Doebeli, 2004), is not possible. In coexistence games, the fixation times can become extremely long (Traulsen et al., 2007; Antal and Scheuring, 2006).

3. A new mapping of payoff to fitness

While the frequency dependent Moran process (as introduced in Nowak et al., 2004) has convenient properties for weak selection, it is less useful for analyzing strong selection (Fudenberg et al., 2006). Now, we assume that (relative) fitness is an exponential function of the payoff,

$${f_A}(j) = {e^{\beta {\pi _A}(j)}}$$

and

$${f_B}(j) = {e^{\beta {\pi _B}(j)}}$$

. The parameter β measures the intensity of selection, similar to w above. As before, the fitness increases with the payoff. But in contrast to w, the parameter β can take any positive value. For β = 0, we obtain neutral drift. For small β, the exponential function can be approximated by a linear function. Therefore, we recover the usual results for weak selection, Eqs. (10)–(12) (Nowak et al., 2004; Traulsen et al., 2006b). For large β, we can analyze the effect of strong selection. In this case, negative payoffs lead to a relative fitness close to zero and positive payoffs lead to very large values for the relative fitness. Since the exponential function is positive for any argument, negative and positive entries in the payoff matrix can be analyzed without restrictions.

Usually, a linear relation between payoff and fitness is assumed with no further justification. In most cases, a different payoff to fitness mapping does not change the qualitative outcome, only the speed of the process. An exponential mapping from payoff to fitness has exactly the same properties as a linear mapping in most cases, but it allows greater variation in the intensity of selection and is thus more general. Exponential functions to calculate fitness from model parameters have been used before (Aviles, 1999).

With the new mapping, the probabilities to change the number of A individuals from j to j ± 1 are given by

$${T^ + }(j) = {{j{e^{\beta {\pi _A}(j)}}} \over {j{e^{\beta {\pi _A}(j)}} + (n - j){e^{\beta {\pi _B}(j)}}}}{{n - j} \over n}$$

,

$${T^ - }(j) = {{(n - j){e^{\beta {\pi _B}(j)}}} \over {j{e^{\beta {\pi _A}(j)}} + (n - j){e^{\beta {\pi _B}(j)}}}}{j \over n}$$

. For the ratio of transition probabilities, we obtain

$${{{T^ - }(j)} \over {{T^ + }(j)}} = {{{f_B}(j)} \over {{f_A}(j)}} = {e^{\beta ({\pi _B}(j) - {\pi _A}(j))}}$$

. This is identical to the corresponding ratio of the pairwise comparison process discussed by Blume (1993), Szabó and Tőke (1998) and Traulsen et al. (2006a, 2007). Thus, both processes have exactly the same fixation probabilities, despite the fact that they are very different in general. For example, here only the fittest individuals reproduce for strong selection, whereas in the pairwise comparison process both types can reproduce. Since the product in Eq. (8) can now be solved exactly, the fixation probability of a single A individual reduces to

$${\phi _A} = {\left( {\sum\limits_{k = 0}^{n - 1} {\exp \left[ {{\beta \over 2}k(k + 1){{ - a + b + c - d} \over {n - 1}} + \beta k{{a - bn + dn - d} \over {n - 1}}} \right]} } \right)^{ - 1}}$$

. Analogously, we find for the fixation probability of a single B individual in a population of n − 1 A individuals

$${\phi _B} = {\left( {\sum\limits_{k = 0}^{n - 1} {\exp \left[ {{\beta \over 2}k(k + 1){{ - a + b + c - d} \over {n - 1}} + \beta k{{an - a - cn + d} \over {n - 1}}} \right]} } \right)^{ - 1}}$$

. For a + d = b + c, the sums can be calculated exactly. In this case, we obtain

$${\phi _A} = {{\exp \left[ {\beta (a - bn + dn - d)/(n - 1)} \right] - 1} \over {\exp \left[ {\beta (a - bn + dn - d)n/(n - 1)} \right] - 1}}$$

. For a + d ≠ b + c, we can replace the sum by an integral to obtain a closed expression for the fixation probabilities

$${\phi _A} = {{{\rm{erf}}[{\xi _1}] - {\rm{erf}}[{\xi _0}]} \over {{\rm{erf}}[{\xi _n}] - {\rm{erf}}[{\xi _0}]}}$$

. We have \({\xi _k} = \sqrt {{\beta \over u}} (ku + v)\), \(2u = (a - b - c + d)/(n - 1) \ne 0\) and \(2v = ( - a + bn - dn + d)/(n - 1)\). The error function is given by \({\rm{erf}}(x) = {2 \over {\sqrt \pi }}\int_0^x {dy{e^{ - {y^2}}}} \). In Traulsen et al. (2006a, 2007), it is shown that this approximation works very well even in small populations. Similar equations hold for ϕ B . They can be obtained by exchanging a ↔ d and b ↔ c.
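The quality of the integral approximation can be checked numerically against the exact finite sum. The sketch below is an illustration only: the function names are hypothetical, and the error function version assumes u > 0 (that is, a + d > b + c) so that all arguments stay real. For u < 0 the arguments become imaginary and this simple version does not apply.

```python
import math


def phi_A_sum(n, a, b, c, d, beta):
    """Exact fixation probability of a single A from the finite sum."""
    s = (-a + b + c - d) / (n - 1)
    t = (a - b * n + d * n - d) / (n - 1)
    return 1.0 / sum(
        math.exp(beta * (0.5 * k * (k + 1) * s + k * t)) for k in range(n)
    )


def phi_A_erf(n, a, b, c, d, beta):
    """Error function approximation; assumes u > 0 (illustrative restriction)."""
    u = (a - b - c + d) / (2 * (n - 1))
    v = (-a + b * n - d * n + d) / (2 * (n - 1))
    if u <= 0:
        raise ValueError("this sketch only covers u > 0")

    def xi(k):
        return math.sqrt(beta / u) * (k * u + v)

    return (math.erf(xi(1)) - math.erf(xi(0))) / (math.erf(xi(n)) - math.erf(xi(0)))


# Illustrative parameters: n = 50, payoffs (a, b, c, d) = (3, 1, 2, 2), beta = 0.1.
exact = phi_A_sum(50, 3, 1, 2, 2, 0.1)
approx = phi_A_erf(50, 3, 1, 2, 2, 0.1)
```

One way to see why the approximation works well is that the integrand evaluated at k + 1/2 is proportional to the k-th term of the exact sum, so the integral acts like a midpoint rule for the sum.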

For the ratio of fixation probabilities, we find

$${{{\phi _B}} \over {{\phi _A}}} = \prod\limits_{j = 1}^{n - 1} {{{{T^ - }(j)} \over {{T^ + }(j)}}} = \exp \left[ { - \beta \left( {{n \over 2}(a + b - c - d) - a + d} \right)} \right]$$

. Again, ϕ A > ϕ B is equivalent to

$${n \over 2}(a + b - c - d) - a + d > 0$$

. But now, this condition is valid for any intensity of selection. Thus, for the exponential payoff to fitness mapping, we find that ϕ A > ϕ B and risk dominance of A are equivalent for any intensity of selection in large populations.
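The claim that the ratio of fixation probabilities has a closed form for any β can be verified directly. In this sketch (hypothetical function names, arbitrary example payoffs), ϕ B is obtained from the routine for ϕ A by relabeling the two types, which swaps a ↔ d and b ↔ c.

```python
import math


def phi_single_A(n, a, b, c, d, beta):
    """Fixation probability of a single A with exponential fitness."""
    total, log_product = 1.0, 0.0
    for j in range(1, n):
        pi_a = ((j - 1) * a + (n - j) * b) / (n - 1)
        pi_b = (j * c + (n - j - 1) * d) / (n - 1)
        # T^-(j) / T^+(j) = exp(beta * (pi_B - pi_A)); accumulate its logarithm.
        log_product += beta * (pi_b - pi_a)
        total += math.exp(log_product)
    return 1.0 / total


def ratio_closed_form(n, a, b, c, d, beta):
    """Closed form for phi_B / phi_A under the exponential mapping."""
    return math.exp(-beta * (0.5 * n * (a + b - c - d) - a + d))


n, payoff = 8, (3, 0, 1, 2)
for beta in (0.01, 1.0, 5.0):
    a, b, c, d = payoff
    phi_a = phi_single_A(n, a, b, c, d, beta)
    phi_b = phi_single_A(n, d, c, b, a, beta)  # relabel A <-> B
    print(beta, phi_b / phi_a, ratio_closed_form(n, a, b, c, d, beta))
```

The two printed ratios agree up to rounding for weak, intermediate and strong selection alike.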

4. Group selection

Group selection is a process where competition occurs between individuals and between groups. It is a mechanism for the evolution of cooperation (Nowak, 2006b; Taylor and Nowak, 2007).

Imagine a population of individuals that is subdivided into groups. The number of groups is constant and given by m. Each group contains between one and n individuals. The total population size, N, can fluctuate between the bounds m and nm.

In each time step, a random individual from the entire population is chosen for reproduction proportional to fitness. The offspring is added to the same group. If the new group size is less than or equal to n, nothing else happens. If the group size exceeds n, then with probability q, the group splits into two. In this case, a random group is eliminated in order to maintain a constant number of groups. With probability 1 − q, however, the group does not divide, but instead a random individual from that group is eliminated (Traulsen and Nowak, 2006).
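A single update of this process might be sketched as follows. Several details are assumptions made only for the illustration and are not specified in the text above: fitness uses the exponential mapping so that it is always positive, an individual that is alone in its group receives payoff 0, a splitting group is cut at a uniformly random position after shuffling, and the eliminated group is drawn uniformly from all groups present after the split. The function name `step` is hypothetical.

```python
import math
import random


def step(groups, n, q, beta, payoff, rng):
    """One update; groups is a list of lists containing 'A' or 'B'."""
    a, b, c, d = payoff
    weights, index = [], []
    for g, members in enumerate(groups):
        size = len(members)
        n_a = members.count("A")
        for kind in members:
            if size == 1:
                pi = 0.0  # assumption: no interaction partner, payoff 0
            elif kind == "A":
                pi = ((n_a - 1) * a + (size - n_a) * b) / (size - 1)
            else:
                pi = (n_a * c + (size - n_a - 1) * d) / (size - 1)
            weights.append(math.exp(beta * pi))
            index.append((g, kind))
    # Choose a parent from the whole population, proportional to fitness.
    g, kind = rng.choices(index, weights=weights, k=1)[0]
    groups[g].append(kind)  # the offspring joins the parent's group
    if len(groups[g]) > n:
        if rng.random() < q:
            # The group splits in two and a random group is eliminated.
            members = groups.pop(g)
            rng.shuffle(members)
            cut = rng.randint(1, len(members) - 1)  # assumption: random cut
            groups.append(members[:cut])
            groups.append(members[cut:])
            groups.pop(rng.randrange(len(groups)))
        else:
            # No split: a random member of the group is eliminated.
            groups[g].pop(rng.randrange(len(groups[g])))
    return groups


rng = random.Random(1)
groups = [["B"] * 4 for _ in range(5)]
groups[0][0] = "A"  # a single A mutant in a population of B
for _ in range(1000):
    step(groups, n=4, q=0.1, beta=1.0, payoff=(1.0, -1.0, 2.0, 0.0), rng=rng)
```

Whatever the choices for these unspecified details, each update keeps the number of groups at m and every group size between 1 and n.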

This minimalist model of multi-level selection has some interesting features. Note that the evolutionary dynamics are entirely driven by individual properties. Only individuals are assigned payoff values. Only individuals reproduce and group splitting is triggered by individual reproduction. Groups can stay together or split when reaching a certain size. Groups that contain fitter individuals reach the critical size faster and, therefore, split more often. This concept leads to selection among groups, although only individuals reproduce. Higher level selection emerges from lower level reproduction. The two levels of selection can oppose each other (Williams, 1966; Wilson, 1975; Hamilton, 1975; Traulsen et al., 2005). We note in passing that the underlying population structure cannot be described by evolution on a fixed graph (Nowak and May, 1992; Lieberman et al., 2005; Ohtsuki et al., 2006, 2007b; Taylor et al., 2007).

While many group selection models consider group competition in terms of differential productivity of groups, here groups are eliminated and successful groups divide. Similar mechanisms where whole groups are taken over have been described before (Bowles et al., 2003; Chalub et al., 2006; Pacheco et al., 2006; Bowles, 2006).

We can compute the fixation probabilities. An analytic calculation is possible in the limit q ≪ 1 where a separation of time scales emerges, as individuals reproduce much more rapidly than groups divide. In this case, most of the groups are at their maximum size, and hence the total population size is almost constant and given by N = nm. We have a hierarchy of two Moran processes. Fixation of a mutant implies first fixation within the group and then fixation of the group’s strategy in the population. On a fast time scale, we have a frequency dependent Moran process within each of the groups. We have described this process in detail in Section 2. On a slower time scale, we have a frequency independent Moran process among pure groups. Once all groups are homogeneous, they stay homogeneous, as no mixing between groups occurs (for a model with migration, see Traulsen and Nowak, 2006).

The probability to change the number of all-A groups from l to l + 1 is given by

$${P^ + }(l) = {{l{f_A}(n)} \over {l{f_A}(n) + (m - l){f_B}(0)}}{{m - l} \over m}$$

. At first, we use a linear payoff to fitness mapping, \({f_A} = 1 - w + w{\pi _A}\) and \({f_B} = 1 - w + w{\pi _B}\). The probability to change the number of all-A groups from l to l − 1, \({P^ - }(l)\), is given by

$${P^ - }(l) = {{(m - l){f_B}(0)} \over {l{f_A}(n) + (m - l){f_B}(0)}}{l \over m}$$

. The ratio of these probabilities reduces to

$${{{P^ - }(l)} \over {{P^ + }(l)}} = {{{f_B}(0)} \over {{f_A}(n)}}$$

. In contrast to the equivalent expression for the dynamics within a single group, this quantity is independent of l. The probability that a single all-A group takes over the population is

$${{\rm{\Phi }}_A} = {\left[ {1 + \sum\limits_{k = 1}^{m - 1} {\prod\limits_{l = 1}^k {{{{f_B}(0)} \over {{f_A}(n)}}} } } \right]^{ - 1}}$$

.

If we add a single A individual to a population of B, then the A individual must first take over its group. Subsequently, this group of A must take over the entire population. Thus, the overall fixation probability of this process, ρ A , is the product of the fixation probability of an individual in a group, ϕ A , and the fixation probability of the group in the population, Φ A . We have ρ A = ϕ A · Φ A .

We find

$${\rho _A} = {\left[ {1 + \sum\limits_{k = 1}^{n - 1} {\prod\limits_{j = 1}^k {{{{f_B}(j)} \over {{f_A}(j)}}} } } \right]^{ - 1}} \times {\left[ {1 + \sum\limits_{k = 1}^{m - 1} {\prod\limits_{l = 1}^k {{{{f_B}(0)} \over {{f_A}(n)}}} } } \right]^{ - 1}}$$

. An equivalent expression holds for ρ B . For weak selection, w ≪ 1, we obtain

$${\rho _A} \approx {1 \over {nm}} + {w \over {6nm}}\left[ { - 2a - b - c + 4d + n(a + 2b - c - 2d) + 3(m - 1)(a - d)} \right]$$

. For the fixation probability of a single B individual in a group structured population we find under weak selection

$${\rho _B} \approx {1 \over {nm}} + {w \over {6nm}}\left[ {4a - b - c - 2d + n( - 2a - b + 2c + d) + 3(m - 1)( - a + d)} \right]$$

. The comparison of the fixation probabilities to the result for neutral selection, 1/(nm), is straightforward under weak selection and follows directly from Eqs. (31) and (32). We can also compare ρ A to ρ B . Then ρ A > ρ B is equivalent to

$$2(m - 2)(a - d) + n(a + b - c - d) > 0$$

. This result has been derived before; see Eq. (22) in the supporting information of Traulsen and Nowak (2006). For the special case of a two parameter Prisoner’s dilemma described by costs and benefits, Eq. (33) means that the benefit to cost ratio of cooperation has to exceed 1 + n/(m − 2); see Traulsen and Nowak (2006). In this special case, the condition is equivalent under weak selection to ρ A > 1/(nm) and to ρ B < 1/(nm).

If the number of groups becomes very large compared to the group size, m ≫ n, then A individuals have a higher probability of fixation than B individuals if a > d. In this case, mixed groups do not influence the dynamics and the Pareto optimal equilibrium is favored. If the number of groups is much smaller than the group size, m ≪ n, then A individuals have a higher probability of fixation than B individuals if A is risk dominant in a single, well-mixed population, a + b > c + d (Nowak et al., 2004; Nowak, 2006a; Fudenberg et al., 2006). In the general case of finite n and finite m, the full condition (33) determines which fixation probability is larger.
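The weak selection results of this section can be checked numerically with the linear mapping. The sketch below is illustrative: the function names are hypothetical, and the Prisoner’s dilemma is parametrized by a benefit `ben` and a cost `cost` as a = ben − cost, b = −cost, c = ben, d = 0, which is one common two parameter form assumed here for the example.

```python
def phi(n, a, b, c, d, w):
    """Fixation probability of a single A within one group (linear fitness)."""
    total, product = 1.0, 1.0
    for j in range(1, n):
        pi_a = ((j - 1) * a + (n - j) * b) / (n - 1)
        pi_b = (j * c + (n - j - 1) * d) / (n - 1)
        product *= (1 - w + w * pi_b) / (1 - w + w * pi_a)
        total += product
    return 1.0 / total


def Phi(m, a, d, w):
    """Fixation probability of a single all-A group among m groups."""
    # In pure groups, pi_A(n) = a and pi_B(0) = d.
    gamma = (1 - w + w * d) / (1 - w + w * a)
    return 1.0 / sum(gamma ** k for k in range(m))


def rho_A(n, m, a, b, c, d, w):
    return phi(n, a, b, c, d, w) * Phi(m, a, d, w)


def rho_B(n, m, a, b, c, d, w):
    # Relabeling the types swaps a <-> d and b <-> c.
    return rho_A(n, m, d, c, b, a, w)


def favors_A(n, m, a, b, c, d):
    """Weak selection condition for rho_A > rho_B."""
    return 2 * (m - 2) * (a - d) + n * (a + b - c - d) > 0


n, m, w = 5, 12, 1e-4
for ben, cost in ((2.0, 1.0), (1.2, 1.0)):
    game = (ben - cost, -cost, ben, 0.0)
    print(ben / cost, rho_A(n, m, *game, w) > rho_B(n, m, *game, w),
          favors_A(n, m, *game))
```

With n = 5 and m = 12 the threshold 1 + n/(m − 2) equals 1.5, so a benefit to cost ratio of 2 favors cooperators while a ratio of 1.2 does not.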

5. The new mapping in the case of group selection

We now use an exponential payoff to fitness mapping for group selection. The probabilities to change the number of all-A groups from l to l ± 1 are given by

$${P^ + }(l) = {{l{e^{\beta {\pi _A}(n)}}} \over {l{e^{\beta {\pi _A}(n)}} + (m - l){e^{\beta {\pi _B}(0)}}}}{{m - l} \over m}$$

,

$${P^ - }(l) = {{(m - l){e^{\beta {\pi _B}(0)}}} \over {l{e^{\beta {\pi _A}(n)}} + (m - l){e^{\beta {\pi _B}(0)}}}}{l \over m}$$

. The ratio of these probabilities simplifies to

$${{{P^ - }(l)} \over {{P^ + }(l)}} = {{{f_B}(0)} \over {{f_A}(n)}} = {e^{\beta ({\pi _B}(0) - {\pi _A}(n))}}$$

. We obtain for the fixation probability of a single all-A group

$${\Phi _A} = {{1 - {e^{ - \beta (a - d)}}} \over {1 - {e^{ - \beta (a - d)m}}}}$$

and for the fixation probability of a single all-B group

$${\Phi _B} = {{1 - {e^{\beta (a - d)}}} \over {1 - {e^{\beta (a - d)m}}}}$$

. Combining these expressions with Eqs. (20) and (21), we obtain the overall fixation probabilities ρ A = ϕ A Φ A and ρ B = ϕ B Φ B again. For weak selection, β ≪ 1, we recover for ρ A and ρ B the approximations (31) and (32) with w replaced by β. Thus, under weak selection both processes have the same fixation properties. In this limit, it is reasonable to compare the fixation probability to a neutral mutant, which is the natural result for β → 0 (Nowak et al., 2004; Antal and Scheuring, 2006; Traulsen and Nowak, 2006; Ohtsuki et al., 2006).
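The closed form for Φ A follows from a geometric sum, because the ratio of the group transition probabilities does not depend on l. A small sketch (hypothetical function names) compares the closed form with the direct sum; it uses `math.expm1` to avoid cancellation for small β and treats the neutral case a = d separately, where the expression becomes 0/0 and the limit is 1/m.

```python
import math


def Phi_A_closed(m, a, d, beta):
    """Closed form (1 - e^{-x}) / (1 - e^{-x m}) with x = beta * (a - d)."""
    x = beta * (a - d)
    if x == 0:
        return 1.0 / m  # neutral limit
    return math.expm1(-x) / math.expm1(-x * m)


def Phi_A_sum(m, a, d, beta):
    """Direct evaluation of the geometric sum."""
    return 1.0 / sum(math.exp(-beta * (a - d) * k) for k in range(m))


def Phi_B_closed(m, a, d, beta):
    # Relabeling the types swaps a and d.
    return Phi_A_closed(m, d, a, beta)
```

For small β the closed form approaches 1/m + β(m − 1)(a − d)/(2m), which is the group level contribution to the weak selection approximation.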

For strong selection, the comparison with the fixation probability of a neutral mutant is no longer meaningful. For instance, consider the interactions of cooperators and defectors. On the individual level, defectors perform better than cooperators, but a group of cooperators is better off than a group of defectors. For strong selection, a single cooperator will hardly ever reach fixation within a group, whereas a single defector group will hardly ever reach fixation in a population of cooperator groups. Thus, many attempts are necessary before either a cooperator or a defector can take over the whole population. Consequently, both ρ A and ρ B become very small as β → ∞. However, we can compare the fixation probabilities of the two strategies directly to each other for any intensity of selection. For any finite β, such a comparison is meaningful, but one has to keep in mind that for large β the fixation probabilities are small and many mutations are necessary before one of them reaches fixation.

Our analytical calculation is valid for small group splitting probability, q ≪ 1, only, but simulations show that larger values of q favor cooperators. The reason for this is that for larger q, groups tend to be smaller, as not all groups grow back to their carrying capacity. In addition, in this case often mixed groups split, which allows cooperators to take over groups without reaching fixation.

For the ratio of the two fixation probabilities, we obtain

$${{{\rho _B}} \over {{\rho _A}}} = \exp \left[ { - \beta \left( {{n \over 2}(a + b - c - d) + (m - 2)(a - d)} \right)} \right]$$

.

The detailed calculation can be found in Appendix A. Hence, ρ A > ρ B if

$$2(m - 2)(a - d) + n(a + b - c - d) > 0$$

. In Section 4, we have derived an identical condition for weak selection, see Eq. (33). Due to our different choice of the payoff to fitness mapping, here the same condition is valid for any intensity of selection. For the special case of a + d = b + c, the conditions ρ A > ρ B , ρ A > 1/(nm) and ρ B < 1/(nm) are equivalent under weak selection. Under strong selection, this is no longer true, as we can have ρ A > ρ B despite ρ A ≪ 1/(nm) and ρ B ≪ 1/(nm).
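A numerical example makes the strong selection regime concrete. The sketch below (hypothetical function names, illustrative Prisoner’s dilemma payoffs a = 1, b = −1, c = 2, d = 0 with n = 5, m = 12 and β = 5) evaluates both overall fixation probabilities with the exponential mapping and compares their ratio with the closed form.

```python
import math


def phi(n, a, b, c, d, beta):
    """Within-group fixation probability of a single A (exponential fitness)."""
    total, log_product = 1.0, 0.0
    for j in range(1, n):
        pi_a = ((j - 1) * a + (n - j) * b) / (n - 1)
        pi_b = (j * c + (n - j - 1) * d) / (n - 1)
        log_product += beta * (pi_b - pi_a)
        total += math.exp(log_product)
    return 1.0 / total


def Phi(m, a, d, beta):
    """Fixation probability of a single all-A group (exponential fitness)."""
    x = beta * (a - d)
    if x == 0:
        return 1.0 / m
    return math.expm1(-x) / math.expm1(-x * m)


def rho_A(n, m, a, b, c, d, beta):
    return phi(n, a, b, c, d, beta) * Phi(m, a, d, beta)


def rho_B(n, m, a, b, c, d, beta):
    # Relabeling the types swaps a <-> d and b <-> c.
    return rho_A(n, m, d, c, b, a, beta)


def ratio_closed(n, m, a, b, c, d, beta):
    return math.exp(-beta * (0.5 * n * (a + b - c - d) + (m - 2) * (a - d)))


n, m, beta = 5, 12, 5.0
game = (1.0, -1.0, 2.0, 0.0)
r_a = rho_A(n, m, *game, beta)
r_b = rho_B(n, m, *game, beta)
print(r_a, r_b, r_b / r_a, ratio_closed(n, m, *game, beta))
```

In this example both fixation probabilities are many orders of magnitude below the neutral value 1/(nm), yet ρ A exceeds ρ B , in agreement with the sign of the condition above.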

Our way to address arbitrary intensities of selection is very different from common approaches taking higher order terms in the intensity of selection into account. Higher order terms make the weak selection approximation more accurate, but the reference point remains neutral selection. Because of the limited convergence radius of such expansions, higher-order approximations cannot provide reliable information on general intensities of selection.

6. Discussion

We have introduced a Moran process where the fitness is an exponential function of payoff. We have shown that the results of Traulsen and Nowak (2006) can be extended to any intensity of selection if this mapping from payoff to fitness is applied.

It has been argued that the model described in Traulsen and Nowak (2006) describes kin-selection. Wild and Traulsen (2007) and Lehmann et al. (2007) have shown that a special form of Eq. (40) can be derived using an inclusive fitness approach. The inclusive fitness approach uses a within-group and between-group relatedness. Although some of our results can be obtained using the mathematical framework of kin selection, there are several conceptual differences.

Our model considers two distinct pure strategies, A and B. In such a system, any selective scenario is possible on the individual level, see Taylor and Nowak (2007). The two types could engage in a coordination game, leading to a bistable situation. The two types could also form a stable polymorphism, leading to an internal equilibrium. Finally, one type can dominate the other, which is the situation considered in Traulsen and Nowak (2006). Weak selection is implemented in the sense that the two types have similar fitness values despite their distinct phenotypes.

In contrast, kin selection methods determine whether cooperativeness increases in a continuous phenotype space. Weak selection in such a system is usually implemented by considering two types that are close to each other in phenotype space. This concept leads to frequency independent selection if only the linear term in the phenotypic distance is considered (Wild and Traulsen, 2007). When higher order terms are considered, aspects of frequency dependence can be addressed (Ross-Gillespie et al., 2007).

The approach of Rousset and Billiard (2000) for calculating fixation probabilities in a kin selection framework hinges on the assumption of frequency independent selection. Hence, it does not necessarily lead to the same fixation probabilities for weak selection as in Traulsen and Nowak (2006). In fact, the same condition for the ratio of the fixation probabilities is only obtained if the payoff matrix fulfills a+d = b+c (Wild and Traulsen, 2007), which is a special situation that has been termed “equal gains from switching” (Nowak and Sigmund, 1990). Thus, our more general Eq. (40) cannot be obtained by current kin selection approaches, as the different weak selection assumption changes the condition.

Furthermore, inclusive fitness arguments use weak selection to decouple relatedness and fitness effects. Relatedness coefficients are calculated in a system without selection. Thus, for inclusive fitness calculations, weak selection is a necessity to avoid the intricacies of calculating relatedness in a system with strong selection, which is usually an insurmountable task. In contrast, for the approach of Traulsen and Nowak (2006), there is no necessity to consider the weak selection limit. Weak selection only serves to simplify the fixation probabilities and to obtain Eq. (33), which is easier to interpret than Eq. (30). In the present paper, we derive results that hold for any intensity of selection, see Eq. (40).

The mathematical methods of game theory in finite populations and inclusive fitness are very different and lead to the same results only in special cases. Inclusive fitness methods use either direct or indirect fitness evaluation, but this seems to be rather a different way to do the accounting (Fletcher et al., 2006; Fletcher and Zwick, 2006). The standard approach of evolutionary game theory is usually much simpler and more direct than the approach of inclusive fitness theory. Often the inclusive fitness approach leads only to a subset of the results and does not provide additional insights. (An important exception, however, is the paper by Taylor et al., 2007, which leads to correction terms for finite population size that could not be reached by Ohtsuki et al., 2006.) Evolutionary game theory analyses the frequency dependent selection between two strategies, A and B. Inclusive fitness theory assumes a continuum of mixed strategies between A and B and then studies the direction of selection in this continuous strategy space. This approach has two problems: (i) for many games, mixed strategies are not meaningful; and (ii) such a local analysis need not have any implication for the original question concerning frequency dependent selection between the strategies A and B.

Finally, the biological concepts of group selection and kin selection should be kept distinct. Group selection arises when there is competition between groups and does not necessarily depend on genetic relatedness or genetic reproduction. The term kin selection was originally defined by Maynard Smith (1964) as referring to situations of genetic reproduction. Conditional strategies, such as different behavior toward siblings and cousins, depend on kin recognition. The idea of kin selection has given rise to a general method of analysis of structured populations (which can be useful, see Taylor et al., 2007), but the mathematical method should not be confused with the biological mechanism (West et al., 2007). Essentially, kin selection analysis captures the effects of assortment, which is a consequence of any mechanism for evolution of cooperation (Taylor and Nowak, 2007). The evolutionary dynamics of group selection and graph selection (Ohtsuki et al., 2006) are very different, although some aspects of both can be captured by inclusive fitness calculations. Kin selection models that are independent of group selection and graph selection should work in well mixed populations based on kin recognition.

The purpose of this paper was to show that an exponential payoff to fitness mapping allows analytical results for any intensity of selection, both in settings of individual and multi-level selection.