Introduction

Survival and reproduction are the two quintessential components of lifetime reproductive success. Eusocial insects are no exception to this view, but at least in some groups, matters are complicated by the fact that alternative types of reproduction exist, namely independent (ICF) and dependent colony foundation (DCF; Cronin et al. 2013). By ICF, new colonies are founded by single fertilized queens (in termites by queens and kings), whereas with DCF, a section of an existing colony splits off together with one or several queen(s) to initiate a new and autonomous colony, thus bypassing the initial solitary phase characterizing ICF. As split-off colonies could be of any size, we may consider ICF only as the smallest possible size of a founding daughter colony on a continuum of possible investment strategies; looked at in this way, the problem can be phrased in analogy to the offspring size vs. number trade-off (Smith and Fretwell 1974). Reproduction by DCF comes under different names like “budding” or “swarming”. In this text, we will mostly use the term “colony fission” or just “fission” and the term “swarm” for the separating, new colony. Colony fission occurs in many different species and has evolved independently several times in wasps, bees, and ants (Cronin et al. 2013). In some stingless bees, the separation of nests seems to be more gradual with an exchange between mother and daughter colony maintained over considerable time (Wille 1983, p. 52).

It is argued that DCF evolved as a strategy to avoid the particular risks, hardships and efficiency deficits solitary queens or young (small) colonies are exposed to (Keller 1991; Debout et al. 2007; Cronin et al. 2013; Jeanne et al. 2022). Such risks include the difficulty of establishing a new nest by a single individual, time-constraints on ICF founded colonies when season length becomes short (Onoyama 1981; Hölldobler and Wilson 1990; Heinze et al. 1996; Cronin et al. 2020), high risks of predation, the stochastic risk of complete colony failure if the number of individuals is very small (Choe 2010), but also the competitive weakness against larger, established colonies of the same or other species (Cronin et al. 2016a, b; Peeters and Molet 2010; Tilman 1994) and the limitation of suitable nest sites if nest densities are high (Kokko and Lundberg 2001). ICF strategies have been modified to mitigate these disadvantages (see Cronin et al. 2013), e.g., by evolving claustral ICF where the young queen carries enough resources to provide the first generation of brood without taking the risk of foraging herself. In other cases, young queens (related or unrelated) join forces in founding new colonies (pleometrosis) thus short-cutting the solitary phase altogether. Yet, in neither case will the founding success of such ICF variants be similar to that achieved by DCF with a truly larger worker number. Consequently, in ants and termites, often \(\le 1\%\) of independently colony founding queens are successful (Hölldobler and Wilson 1990; Tschinkel 2006; Cronin et al. 2013).

However, DCF is also associated with some potential disadvantages. First, DCF limits dispersal in species with wingless workers like ants and termites (Cronin et al. 2013, 2016a; Peeters and Molet 2010); only bees and wasps have the ability to swarm over larger distances as workers are capable of flying. Limited dispersal enhances kin-competition (Cronin et al. 2016b) and may limit options to spread risks or colonize new habitats (den Boer 1990; Cronin et al. 2016b; see Medeiros and Araújo 2014 for a case study) and may also be a poor strategy if suitable nest sites are limited in the near distance (Peeters and Aron 2017). Further, DCF is associated with substantial investment costs when releasing a reproductive unit. In the case of colony fission, the whole colony splitting off has to be considered the reproductive investment (Pamilo 1991; Amor et al. 2021). Even if sexuals may be more expensive to produce than workers, colonies could nonetheless produce far more young queens (at a faster rate) than split-off colonies. Finally, there may be a more hidden cost associated with fissioning: if (new) colonies need to have a minimum size to establish successfully, fissioning may also interfere with maintaining established (old) colonies at the size that provides maximum productivity as established colonies need to grow beyond a threshold size before splitting into two (or more) colonies (Holway and Case 2000; Walin et al. 2001; Buczkowski and Bennett 2008).

It is worth noting the similarity between fissioning and vegetative growth in plants whereby plants that reproduce vegetatively invest proportionally far more resources into a nearby offspring than into a single seed (Hakala et al. 2019; Gibb et al. 2023). In particular, in very competitive environments (dense plant cover), such strategies may be of benefit as seeds, viz., seedlings may have very limited chances to establish (Crawley 1990; Benson and Hartnett 2006; Elson and Hartnett 2017). However, vegetative growth has the same disadvantages as colony fissioning, in particular limited dispersal promoting competition among kin and hampering the ability to spread risks or colonize new habitats. Presumably, for this reason, many plants are capable of combining vegetative and sexual reproduction (Yang and Kim 2016). Similarly, the combination of ICF and DCF occurs in many ants (Briese 1983; Heinze 1993; Leal and Oliveira 1995; Cronin et al. 2020) with DCF presumably becoming favored as environmental conditions become harsher (Molet et al. 2008; Cronin et al. 2020). In bees and wasps, such mixed strategies are not known (Cronin et al. 2013).

Possibly, the best studied case of colony fissioning is the swarming behavior of honey bees. It is a particular attribute of honey bees that it is the old mother-queen that leaves an established nest with a large fraction of the workers. Rangel and Seeley (2012) found an average swarm fraction of 0.75, i.e., the swarm contains typically 3 times more workers than the colony left behind. They also found a significant positive effect of swarm size and swarm fraction on the growth (i.e., comb built, brood produced, food stored, and weight gained) and the winter survival of swarms, whereas the number of workers remaining in the old nest (with a new queen) had, over a wide range, little effect on colony survival (Rangel et al. 2013). Importantly, the old nest retains all the infrastructure, past investments, and all the brood that exists in the nest at the moment of swarming. One contributing mechanism determining the optimal swarm size may be the fact that large swarms could discover and advertise nest sites at a faster rate than small swarms without loss of decision accuracy (Schaerf et al. 2013); however, other factors may play a role, and in particular, the fact that the swarm needs to rebuild, in a limited time, all the infrastructure and investments left behind in the old nest (Rangel and Seeley 2012).

A honey bee colony may produce swarms several times during a season (if that is long enough). Winston (1980) found that the number of “after-swarms” increases with the amount of sealed brood at time of first swarming. Mean number of daughter queens per queen at beginning of season was 3.6. First swarms were on average larger (mean \(\sim 16,000\)) than first (mean \(\sim 11,500\)) or second/third (mean \(\sim 4000\)) after-swarms.

It is consequently a topic of interest how a colony’s decision on fissioning affects (colony) fitness. The critical questions are (i) why fissioning (=DCF) is a better strategy than ICF (releasing single queens) at all, (ii) what colony size (worker number) must be reached before fissioning, and (iii) in which proportion the worker force should split up between the swarm and the old colony. The right moment for splitting can then be related to the time span needed for a colony to regrow from one fissioning episode to the next, but may also be affected by external factors like season or resource availability. Rangel et al. (2013), based on a model by Bulmer (1983), developed an inclusive fitness model to investigate the optimal swarm fraction x (the proportion of colony members joining the swarm) in which workers should distribute themselves between the outgoing mother (the swarm) and the remaining colony (with a new queen). They found that the optimal swarm fraction should be about \(x \approx 0.75\). Their prediction is in good agreement with the empirical findings mentioned in the previous paragraph. However, critical evaluation of the model (more on this in the discussion and Supplement) suggests that the relatedness of workers to the offspring of their mother-queen and their sister queen, respectively, has in fact little influence on this ratio compared to the role of the survivorship functions relating colony winter survival to swarm size. Further, in army ants, where intra-colonial genetic relationships are similar to those in honey bees, colonies typically split into two colonies of about even size, i.e., \(x\approx 0.5\) (Gotwald 1995), and in other ants, the sizes of split-off colonies and swarm fractions are highly variable and sometimes quite small (Cronin et al. 2013, 2016a; Chéron et al. 2011). Finally, the model by Rangel et al. (2013) only specifies the swarm fraction in which the colony’s workers should split up but does not say anything about the absolute colony size at which fissioning should occur, nor why or when fissioning is a better strategy than ICF.

It is the aim of this paper to provide a model that gives answers to the three questions raised in the previous paragraph. Detailed models of colony dynamics with many parameters are suitable for forecasting and a detailed comparison between empirical results and model predictions (Becher et al. 2014). Simpler, strategic models of colony dynamics as proposed by Khoury et al. (2011, 2013) provide a better basis for a strategic analysis, yet using simple models is a surprisingly rare approach primarily used in theoretical studies of honey bee dynamics focusing on reasons for colony failure (e.g., Russell et al. 2013). Here, we present a strategic colony model to describe the dynamics of a colony growing from one fission event to the next. We use the model to identify the factors that (i) determine the absolute worker number at which colonies should split and (ii) the swarm fraction in which the split should occur. The model also (iii) identifies conditions where the optimal swarm size would be 1, i.e., where ICF is a better strategy than DCF and (iv) estimates the time interval between swarming events. We will primarily address the question of optimal resource allocation in regard to colony reproductive output, but we will discuss additional ecological factors that may affect the fitness of ICF vs. DCF and possibly promote mixing of both strategies.

We do not represent or investigate in our model how the decisions on whether to swarm and how to split up a colony are taken, nor how swarms select a new nest site. Other studies have addressed such questions (e.g., Pratt et al. 2002; Laomettachit et al. 2015; Lavallée et al. 2022).

Colony growth model

Finding the optimal colony and swarm size in honey bees or other social insects is closely related to understanding other life-history problems (e.g., Roff 2002, 2008) and resembles the problem of optimizing offspring size under a size–number trade-off (Smith and Fretwell 1974). It can be approached either as a timing problem or as a resource allocation problem. The question about the optimal pattern of worker and sexual production in annual eusocial insects is typically analyzed as a resource allocation problem and predicts a clear temporal sequence of first investing into worker production only and switching to pure sexual production toward the end of the season (Macevicz and Oster 1976). Yet, the question about optimal swarm size involves a single caste only (the workers) as investment into sexuals (outgoing queens) becomes negligible. This suggests using a direct timing perspective. Thus, we do not depend on the advanced methods of dynamic optimization to identify optimal strategies but apply well-known models of population dynamics.

We need to understand what is the optimal size of a swarm released (new colony) as well as what is the optimal size of the colony left behind (old colony), e.g., without undermining its functionality (cf. Buczkowski and Bennett 2009). As both quantities are related to the growth properties of colonies, they must, by implication, also affect the temporal frequency of swarming. Here, we will provide a minimalistic concept which may explain typical honey bee swarm characteristics as an evolutionary consequence of colony dynamics and nest founding probability. However, the model is general enough to be applicable, with modifications, to other species too (see discussion).

Fig. 1

Effect of slope parameter b on the shape of the swarm’s (new colony) survivorship function (Eq. 3). The half-saturation constant remains fixed at \(S_c=6000\)

Our approach is based on the assumption that colony growth is, at least beyond certain sizes, self-limited, that is, worker efficiency declines or worker mortality increases due to some density-dependent effects (cf. Winston 1980; Khoury et al. 2013). Reasons for such declining efficiency could be that workers need to move longer distances to collect food due to the effect of intra-colonial foraging competition or due to constraints on the availability of (optimal) nest sites. In the simplest case, we can represent this type of growth regulation for worker number W(t) at time t by a classical logistic model

$$\begin{aligned} \frac{dW(t)}{dt} = r_0 W(t) \left( 1-\frac{W(t)}{K} \right) , \end{aligned}$$
(1)

where W(t) is the number of individuals in the colony at time t (measured in days). The daily colony growth rate is given by \(r_0\) and the saturation level K is the upper limit of the number of insects in the colony (colony size). Note that W is the number of all individuals in the colony including gynes and queens. For simplicity, we will often just talk of worker number, however, as they should provide the largest share of individuals. This differential equation has the explicit solution for initial condition \(W(0)=W_0\)

$$\begin{aligned} W(t)=\frac{K}{1+\exp (a-r_0 t)}, \end{aligned}$$
(2)

with \(a=\ln (\frac{K}{W_0}-1)\). In honey bees, the future survival of the remaining colony (nest) seems to be mostly insensitive to \(W_0\) above certain limits (about \(W_0>1000-1500\); see Rangel and Seeley 2012; Rangel et al. 2013), but the future survival and/or reproductive success of the swarm depends on the swarm size (Lee and Winston 1987; Rangel and Seeley 2012): that is, survival is an increasing function of swarm size. Existing data on swarm winter survival are somewhat contradictory and do not allow precise estimation of this relationship (e.g., Rangel et al. 2013; Smith et al. 2016; Seeley 2017). However, a sigmoid survival function s(S) as used by Rangel et al. (2013) appears plausible

$$\begin{aligned} s(S)=\frac{1}{1+\exp (-b (S-S_c))}, \end{aligned}$$
(3)

where S is the number of individuals in the swarm (including the queen), \(S_c\) is the half-saturation constant, and b is the parameter that governs the rate of change of s(S) with respect to S (Fig. 1). The function s(S) represents the probability that a swarm of size S will survive and become established as an independent colony. For simplicity, we assume a maximum probability of survival (asymptote) of one, which implicitly assumes that large swarms almost always survive. In reality, this is certainly not the case (see citations above), but different values of the asymptote would not qualitatively affect our results. With larger values of \(S_c\) and especially b, the function s(S) shares the convex (at low investment) to concave (at high investment) shape transition of the investment–fitness relationship in Smith and Fretwell (1974).
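Eqs. (2) and (3) can be evaluated directly. The following is a minimal Python sketch for illustration only; it is not the authors' code (their simulations were implemented in R), and the function names `logistic_size` and `swarm_survival` are hypothetical. The parameter values are those of the Fig. 2 example.

```python
import math


def logistic_size(t, W0, K, r0):
    """Colony size W(t) from Eq. (2), starting at W(0) = W0 (t in days).

    Requires 0 < W0 < K.
    """
    a = math.log(K / W0 - 1.0)
    return K / (1.0 + math.exp(a - r0 * t))


def swarm_survival(S, S_c, b):
    """Sigmoid establishment probability s(S) from Eq. (3).

    Evaluated in a form that avoids overflow of exp() when |b (S - S_c)| is large.
    """
    z = b * (S - S_c)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


if __name__ == "__main__":
    K, r0, S_c, b = 15_000, 0.03, 6_000, 0.002  # values from the Fig. 2 example
    for day in (0, 30, 60, 90):
        print(day, round(logistic_size(day, W0=3_899, K=K, r0=r0)))
    for S in (1, 3_000, 6_000, 9_000):
        print(S, swarm_survival(S, S_c, b))
```

By construction, `swarm_survival` returns 0.5 at \(S=S_c\), which is why \(S_c\) is called the half-saturation constant.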

The expectation is that colonies choose swarm size S and remaining colony size \(W_0\) in such a way as to maximize the output of surviving swarms over time; maximizing that output is analogous to maximizing the net-reproductive rate, which—in stable populations—is the currency of fitness if we want to compare the success of different life-history strategies (Charlesworth 1994; Roff 2008). To find that strategy, we solve Eq. (2), for any given \(W_0\) and S and \(W(T)=W_0+S\), for the time \(T(W_0,S)\) necessary to grow the colony from size \(W_0\) to size \(W_0+S\). At that moment, a colony would release a swarm of size S with \(0<S<K\) (the general behavior of the population model assures \(W < K\) for finite time if \(W_0 < K\)). The solution can be found by simple algebraic rearrangement

$$\begin{aligned} T(W_0, S)=\frac{1}{r_0} \left( \ln \left(\frac{K}{W_0}-1\right)-\ln \left(\frac{K}{W_0+S} -1\right) \right) , \end{aligned}$$
(4)

given that \(W_0+S < K\). In the following, we will call this time between swarming events the “swarm interval”. Maximum output of surviving swarms will be achieved when the reproductive rate \(\rho\) becomes maximal

$$\begin{aligned} \rho = \frac{s(S)}{T(W_0, S)}. \end{aligned}$$
(5)

Note that the reproductive rate calculated here is not just the rate at which swarms are released but the (expected) rate at which surviving swarms, those that successfully establish a new colony, are established; it is thus not the inverse of the swarm interval. Further note that our approach does not account for possible multiplicative effects of swarms on future fitness expectation; this may be ignored if we assume a stable number of colonies at the population level.
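Eqs. (4) and (5) translate into a few lines of code. The sketch below is an illustrative Python version under the model's assumptions; the function names are hypothetical and not taken from the authors' R implementation.

```python
import math


def swarm_interval(W0, S, K, r0):
    """Days needed to regrow from W0 to W0 + S under logistic growth (Eq. 4)."""
    if not (W0 > 0 and S > 0 and W0 + S < K):
        raise ValueError("Eq. (4) requires W0 > 0, S > 0 and W0 + S < K")
    return (math.log(K / W0 - 1.0) - math.log(K / (W0 + S) - 1.0)) / r0


def swarm_survival(S, S_c, b):
    """Sigmoid establishment probability s(S) from Eq. (3)."""
    return 1.0 / (1.0 + math.exp(-b * (S - S_c)))


def reproductive_rate(W0, S, K, r0, S_c, b):
    """Expected number of surviving swarms per day (Eq. 5)."""
    return swarm_survival(S, S_c, b) / swarm_interval(W0, S, K, r0)


if __name__ == "__main__":
    # Parameter values and strategy taken from the Fig. 2 example.
    T = swarm_interval(W0=3_899, S=7_202, K=15_000, r0=0.03)
    rho = reproductive_rate(3_899, 7_202, 15_000, 0.03, S_c=6_000, b=0.002)
    print(f"swarm interval: {T:.1f} days, surviving swarms per 200 days: {200 * rho:.2f}")
```

For the Fig. 2 strategy, the computed interval should agree with the roughly 70 days quoted in that figure's caption.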

In the following, we identify optimal strategies, i.e., optimal combinations \(\hat{W_0}, {\hat{S}}\) that maximize the reproductive rate \(\rho\) (Eq. 5). These optimal combinations will depend on parameters K and the swarm survival parameters \(S_c\) and b. Because, as we will explain below, the growth rate \(r_0\) has no effect on the optimal strategy, we fix this value to \(r_0=0.03\)/day throughout the paper, a value that results in colony dynamics similar to those observed in honey bees (see Table 1).

Table 1 Definition of model parameters and range of values tested for their impact on the optimal swarming strategy \(\hat{W_0}, {\hat{S}}\)

Results

We could not find a complete analytical solution for the joint maximization of Eq. (5) with respect to parameters K, \(S_c\) and b. However, as \(W_0\) is only included in the denominator of Eq. (5), we can find an analytical solution for the optimal \({{\hat{W}}}_0\) that minimizes \(T(W_0, S)\) and consequently maximizes Eq. (5) for any given S by solving

$$\begin{aligned} \frac{\partial T(W_0,S)}{\partial W_0} =\frac{K}{r_0} \left( \frac{1}{(S+W_0)(K-S-W_0)} - \frac{1}{W_0 (K - W_0)} \right) =0. \end{aligned}$$
(6)

This gives the root

$$\begin{aligned} {{\hat{W}}}_0 = \frac{K-S}{2} \text { and } K > S, \end{aligned}$$
(7)

which by simple algebraic manipulation can be shown to be a minimum for \(T(W_0, S)\) and hence a maximum for \(\rho\). This solution makes intuitive sense. In the logistic model, colony growth is maximal at \(W=K/2\) and symmetric around K/2. It must, consequently, be most efficient to grow from \({{\hat{W}}}_0=K/2 - S/2\) to \(K/2+S/2={{\hat{W}}}_0+S\) between fission events. It is further noteworthy that \({{\hat{W}}}_0\) directly depends only on the model parameter K. The other parameters can only indirectly affect \({{\hat{W}}}_0\) via their impact on the optimal swarm size \({{\hat{S}}}\) (see below). Most importantly, this result reduces the problem of finding the optimal strategy \((\hat{W_0}, {\hat{S}})\) to one of optimizing S alone.
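The root in Eq. (7) can also be checked numerically by brute force. The following Python sketch scans all admissible integer values of \(W_0\) for a given S and returns the one with the shortest swarm interval; the helper name `best_start_size` is hypothetical and the check is only an illustration, not part of the original analysis.

```python
import math


def swarm_interval(W0, S, K, r0):
    """Time to grow from W0 to W0 + S under logistic growth (Eq. 4)."""
    return (math.log(K / W0 - 1.0) - math.log(K / (W0 + S) - 1.0)) / r0


def best_start_size(S, K, r0):
    """Integer W0 with 0 < W0 and W0 + S < K that minimizes the swarm interval."""
    candidates = range(1, K - S)  # upper bound enforces W0 + S < K
    return min(candidates, key=lambda W0: swarm_interval(W0, S, K, r0))


if __name__ == "__main__":
    K, r0 = 15_000, 0.03
    for S in (1_000, 5_000, 7_202):
        print(S, best_start_size(S, K, r0), (K - S) / 2)
```

When \(K-S\) is even, the scan should return exactly \((K-S)/2\); when it is odd, the two integers next to \((K-S)/2\) give equal intervals up to rounding.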

Given solution (7), we can insert \({{\hat{W}}}_0 = (K-S)/2\) into Eq. (4) and find, after some rearrangement that

$$\begin{aligned} T({{\hat{W}}}_0, S)=\frac{2}{r_0} \cdot \ln \left( \frac{K+S}{K-S} \right) , \end{aligned}$$
(8)

which must monotonically increase with increasing S. Collecting terms, we find

$$\begin{aligned} \rho =\frac{r_0}{2} \cdot \frac{\frac{1}{1+\exp (-b (S-S_c))}}{\ln \left(\frac{K+S}{K-S}\right)}. \end{aligned}$$
(9)
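The rearrangement behind Eq. (8) is short and can be spelled out. Inserting \({{\hat{W}}}_0=(K-S)/2\), and hence \({{\hat{W}}}_0+S=(K+S)/2\), into the two logarithms of Eq. (4) gives

$$\begin{aligned} \frac{K}{{{\hat{W}}}_0}-1 = \frac{2K}{K-S}-1 = \frac{K+S}{K-S}, \qquad \frac{K}{{{\hat{W}}}_0+S}-1 = \frac{2K}{K+S}-1 = \frac{K-S}{K+S}, \end{aligned}$$

so that the difference of the two logarithms equals \(2\ln \left( \frac{K+S}{K-S}\right)\). Differentiating Eq. (8) with respect to S confirms the monotonic increase for \(0<S<K\):

$$\begin{aligned} \frac{\partial T({{\hat{W}}}_0, S)}{\partial S} = \frac{2}{r_0} \cdot \frac{2K}{(K+S)(K-S)} > 0. \end{aligned}$$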

For presentation of results, we further define the swarm fraction as the proportion of individuals that join a separating swarm

$$\begin{aligned} x=\frac{S}{W_0+S}. \end{aligned}$$
(10)

We use numerical simulations, all implemented in R version 4.2.2 (R Core Team 2022), to find the optimal strategy \({\hat{S}}, \hat{W_0}\) that maximizes the reproductive rate \(\rho\) for all combinations of \(K \in \{15,000, 30,000\}\) and a fine-grained spectrum for parameters \(S_c\) and b (see Table 1). As explained above, we only need to screen the space \(S \in \{1, \ldots , K-1\}\) to find \({\hat{S}}\) as the corresponding optimal value \(\hat{W_0}\) is then defined by Eq. (7).
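The screening step described above amounts to an exhaustive search over integer swarm sizes. The authors' implementation is in R and is not reproduced here; what follows is a minimal Python sketch of the same idea with hypothetical function names, using Eq. (9) for \(\rho\), Eq. (7) for \({{\hat{W}}}_0\), and Eq. (10) for the swarm fraction.

```python
import math


def swarm_survival(S, S_c, b):
    """Sigmoid establishment probability s(S) from Eq. (3), overflow-safe."""
    z = b * (S - S_c)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def reproductive_rate(S, K, r0, S_c, b):
    """Rate of surviving swarms (Eq. 9), i.e. with W0 already set by Eq. (7)."""
    return 0.5 * r0 * swarm_survival(S, S_c, b) / math.log((K + S) / (K - S))


def optimal_strategy(K, r0, S_c, b):
    """Screen S = 1 ... K-1 and return (S_hat, W0_hat, swarm fraction x)."""
    S_hat = max(range(1, K), key=lambda S: reproductive_rate(S, K, r0, S_c, b))
    W0_hat = (K - S_hat) / 2            # Eq. (7)
    x = S_hat / (W0_hat + S_hat)        # Eq. (10)
    return S_hat, W0_hat, x


if __name__ == "__main__":
    for b in (0.0015, 0.002):
        print(b, optimal_strategy(K=15_000, r0=0.03, S_c=6_000, b=b))
```

For the Fig. 2 parameter set (\(b=0.002\)), the search should land at or very near the reported \({\hat{S}}=7202\) and \(\hat{W_0}=3899\), whereas for \(b=0.0015\) it should return \({\hat{S}}=1\), in line with the Fig. 3 caption.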

In Fig. 2, we show an example of the colony dynamics of a colony that applies the optimal strategy \({{\hat{W}}}_0\) and \({{\hat{S}}}\) for a particular set of model parameter values. Importantly, the ICF strategy \(S=1, W_0=K/2\) always provides a local maximum in the whole range of parameters tested as exemplarily shown in Fig. 3 for the same parameter set used in Fig. 2 as well as two alternative values of b. In the following, we prove that this must be so for even larger values of b than those considered here.

With \(S\rightarrow 0^+\), the numerator of Eq. (9) approaches a fixed value, whereas the denominator approaches 0, so that \(\rho\) has a vertical asymptote at \(S = 0\) with \(\rho \rightarrow +\infty\) as \(S \rightarrow 0^+\). However, the smallest possible swarm size is \(S=1\), so that demonstrating the existence of a local (or edge) maximum at \(S=1\) requires investigation of the behavior of \(\frac{\partial \rho }{\partial S} |_{S=1}\). We find [checked by Wolfram Alpha (Wolfram Research Inc 2024)]

$$\begin{aligned} \frac{\partial \rho }{\partial S}= \rho \left( \frac{b \cdot \exp (b(S_c-S))}{\exp (b(S_c-S))+1} - \frac{2K}{(K+S)(K-S) \cdot \ln \left(\frac{K+S}{K-S}\right)} \right) . \end{aligned}$$
(11)

Because \(\rho > 0\) for \(0<S<K\), the sign of this derivative is set by the term in parentheses. At \(S=1\), the derivative becomes

$$\begin{aligned} \left. \frac{\partial \rho }{\partial S} \right| _{S=1}= \rho |_{S=1} \left( \frac{b \cdot \exp (b(S_c-1))}{\exp (b(S_c-1))+1} - \frac{2K}{(K^2-1) \cdot \ln \left(\frac{K+1}{K-1}\right)} \right) . \end{aligned}$$
(12)

To provide a local maximum at \(S=1\), the derivative must be negative, so that

$$\begin{aligned} \frac{b \cdot \exp (b(S_c-1))}{\exp (b(S_c-1))+1} < \frac{2K}{(K^2-1) \cdot \ln \left(\frac{K+1}{K-1}\right)} \approx 1, \end{aligned}$$
(13)

noting that both sides of the inequality take positive values and that the term on the right side very rapidly converges to 1 for \(K>1\). This is so because the Laurent series expansion of the right-hand side of inequality (13) is \(1 + \frac{2}{3 K^2} + \frac{26}{45 K^4} + O\left( K^{-5}\right)\). We can thus further simplify and rearrange to

$$\begin{aligned} b<1+\frac{1}{\exp (b(S_c-1))}. \end{aligned}$$
(14)

This tells us that for any K of relevance (K must be substantially larger than 1 to build a colony of insects in the first place) and any \(S_c>1\), \(S=1\) is a local maximum for \(\rho\) if \(b \le 1\). In turn, we can conclude that \({{\hat{S}}}=1\) cannot be a global maximum if the inequality of Eq. (14) is not satisfied.
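The condition in inequality (13) is easy to evaluate numerically. The sketch below is an illustrative Python check with hypothetical function names; it assumes \(b>0\) and \(S_c \ge 1\), and it compares the right-hand side with the first two terms of the series expansion quoted above.

```python
import math


def marginal_survival_gain(b, S_c):
    """Left-hand side of inequality (13): b * e^z / (e^z + 1), z = b (S_c - 1).

    Written as b / (1 + e^-z), the same expression in a form that cannot
    overflow for z >= 0 (assumes b > 0 and S_c >= 1).
    """
    return b / (1.0 + math.exp(-b * (S_c - 1.0)))


def marginal_interval_cost(K):
    """Right-hand side of inequality (13).

    ln((K+1)/(K-1)) is computed as log1p(2/(K-1)) to keep precision for large K.
    """
    return 2.0 * K / ((K * K - 1.0) * math.log1p(2.0 / (K - 1.0)))


def icf_is_local_maximum(K, b, S_c):
    """True if inequality (13) holds, i.e. rho decreases with S at S = 1."""
    return marginal_survival_gain(b, S_c) < marginal_interval_cost(K)


if __name__ == "__main__":
    K = 15_000
    print(marginal_interval_cost(K) - 1.0, 2.0 / (3.0 * K**2))
    print(icf_is_local_maximum(K, b=0.002, S_c=6_000))
    print(icf_is_local_maximum(K, b=2.0, S_c=6_000))
```

The value \(b=2\) lies far outside the parameter range explored in the paper and is used here only to show a case in which the condition fails.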

Fig. 2

Example of colony dynamics of a colony that applies the optimal strategy, i.e., the one producing surviving swarms at the fastest rate (Eq. 9), for parameters set as follows: \(K=15,000\), \(r_0=0.03\), \(S_c=6000\), \(b=0.002\). In this example, the optimal combination is \(\hat{W_0}=3899\) (blue hatched line) and \({\hat{S}}=7202\). The colony will split when it reaches the size \(\hat{W_0}+{\hat{S}}=11,101\) indicated by the red hatched line, which in this example occurs every c. 70 days. The value K/2 is indicated by the gray line; the colony fluctuates symmetrically in size around this line

With very small values of decay parameter b in Eq. (3) or small values of \(S_c\), the optimal swarm size \({{\hat{S}}}\) may become 1, i.e., \(S=1\) becomes also the global maximum (cf. different lines in Fig. 3). Consequently, fissioning is not the best strategy and colonies should release fertilized queens one by one (see red lines in Fig. 4). That such a transition in parameter space must occur can be understood by the following reasoning: assume that, by setting \(b=0\) in Eq. (3), the survival function s(S), which is the numerator of \(\rho\), would not depend on S. The swarm interval prolonging effect of S would consequently result in a trivial fitness maximum at \({\hat{S}}=1\) in the range \(1 \le S < K\). As all functions are continuous in this model, we can expect a similar result in a certain range of parameter values \(b>0\). On the other hand, when \(b \rightarrow \infty\), the function s(S) (Eq. 3) turns into a step function with \(s(S)=0\) if \(S<S_c\) and \(s(S)=1\) if \(S>S_c\). In this case, the obvious best strategy is \({\hat{S}}=S_c+1\). Further, the sigmoid survival function s(S) is almost constant near the right (\(S=K-1\)) and left edge (\(S=1\)) of the allowable range of swarm sizes provided that parameter b is sufficiently large, i.e., \(b>0.001\) (see Fig. 1). In this case, we can conclude that the fitness function will decrease with S near \(S=1\) (see above) and \(S=K\) due to the combination of approximately constant survival and an increasing inter-swarming interval (see above). A maximum with \({\hat{S}}>1\) will consequently only exist if the increase of the (monotonic) survival function is in some range of S sufficiently steep (b is sufficiently large) to compensate for the increasing effect of S on the inter-swarming interval (denominator). The plateau-like behavior of the survival curve and very low survivorship near \(S=1\) is thus the underlying reason why the optimal swarm size \({{\hat{S}}}>1\), where it exists, must typically be much larger than one. It is in fact obvious that \({{\hat{S}}} \ge S_c\) in that case, because the slope of the survival curve is steepest at \(S=S_c\). Clearly, \({{\hat{S}}}\) may be small if \(S_c\) itself takes values near 1.

In case ICF is the best strategy, the maximum reproductive rate is achieved by setting \(W_0=\frac{K}{2}\) (precisely to \(W_0=\frac{K}{2}-0.5\)) and continuously releasing queens one by one. There may be external factors like the need to synchronize mating flights (Otis et al. 1999) or waiting for favorable climatic conditions (Pereira et al. 2010; Abou-Shaara et al. 2017) that are responsible for, e.g., pulsed releases of sexuals even in cases where the optimal reproductive strategy is ICF.

Growth rate \(r_0\) has no effect on \({{\hat{W}}}_0\) and \({{\hat{S}}}\) as it appears only as a simple multiplier in Eqs. (4) and (8). However, a larger \(r_0\) will shorten the swarm interval in proportion to \(\frac{1}{r_0}\) and consequently increase the expected reproductive rate \(\rho\) in direct proportion to \(r_0\). Equally, colonies with larger K can produce swarms of a particular size S faster than colonies with a smaller K, so that a large K increases the reproductive rate (compare rightmost panels in Fig. 4). However, K has a negligible effect on the location of the ICF-DCF transition in parameter space (compare red lines in the top and bottom panels of Fig. 4) and only a weak effect on the optimal swarm size \({{\hat{S}}}\) (leftmost panels in Fig. 4). We only recognize relevant differences in optimal swarm size if \(S_c\) approaches values near the smaller of the two capacities (\(K=15,000\)), and if values of b are rather small (lower right section of the colored region in Fig. 4). In this parameter section, optimal swarm size is approximately 10% larger for \(K=30,000\) than for \(K=15,000\).
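The scaling statements about \(r_0\) and K can be illustrated with a short numerical experiment. The Python sketch below is only an illustration with hypothetical function names; the \(r_0\) values 0.01 and 0.05 are arbitrary choices for the comparison and are not taken from Table 1.

```python
import math


def reproductive_rate(S, K, r0, S_c, b):
    """Rate of surviving swarms from Eq. (9); valid for moderate b (S - S_c)."""
    survival = 1.0 / (1.0 + math.exp(-b * (S - S_c)))
    return 0.5 * r0 * survival / math.log((K + S) / (K - S))


def optimum(K, r0, S_c, b):
    """Return the optimal swarm size and the reproductive rate it achieves."""
    S_hat = max(range(1, K), key=lambda S: reproductive_rate(S, K, r0, S_c, b))
    return S_hat, reproductive_rate(S_hat, K, r0, S_c, b)


if __name__ == "__main__":
    S_c, b = 6_000, 0.002
    for r0 in (0.01, 0.03, 0.05):       # arbitrary growth rates for comparison
        print("r0 =", r0, optimum(15_000, r0, S_c, b))
    for K in (15_000, 30_000):          # the two capacities used in the paper
        print("K =", K, optimum(K, 0.03, S_c, b))
```

Under these assumptions, the optimal swarm size should not change with \(r_0\), while the achieved rate scales in proportion to it; doubling K should raise the rate and shift \({\hat{S}}\) only slightly.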

Fig. 3

Change in reproductive rate (blue lines) in relation to swarm size S for the same parameters used in Fig. 2 (\(b=0.002\); full line) and two different values for b (all other parameters equal); for better readability, we converted the reproductive rate to a seasonal output assuming a season length of 200 days. The local maxima at \(S=1\) are highlighted by dots. For \(b=0.0015\), this is also the global maximum; for the two other cases, it is not. The transition from a global maximum at \(S=1\) to one at \(S \gg 1\) is very sensitive to change in b; see Fig. 4 and text for more explanations. The red line shows the duration of the swarm interval \(T(\hat{W_0}, S)\) which is not affected by b (cf. Eq. 8)

Fig. 4

Optimal swarm size \({{\hat{S}}}\) (absolute numbers), optimal swarm fraction x (proportion of colony members joining swarm), and resulting reproductive rates \(\rho\) (per season and on logarithmic scale) with respect to the swarm survival half-saturation constant \(S_c\) and the swarm survival decay parameter b. Top row shows results for maximum worker number of \(K=15,000\) and bottom row for \(K=30,000\) workers. The red lines delineate the region where \({{\hat{S}}}=1\) (ICF) is the best strategy (white) from the DCF region (color shading)

Discussion

We here provide, to our knowledge for the first time, a simple strategic model that gives answers to the questions under which conditions DCF may be a better reproductive strategy than ICF, how large colonies should grow before fissioning, and how colonies should allocate workers to the old and new colony when fissioning. Our most important findings can be summarized by three statements. (1) Colony growth rate \(r_0\) should not affect reproductive decisions and carrying capacity K has only marginal effects (both have effects on the swarm interval, however). (2) Over a wide spectrum in parameter space (\(S_c, b\)) a discontinuity in the optimal strategy emerges (Figs. 3 and 4): either split-off colonies (swarms) should be of size \(S=1\) (ICF) or they should be of considerable size \(S \gg 1\) (DCF). (3) Whether DCF or ICF is the better strategy should thus primarily depend on the location of the point of inflection at \(S = S_c\) and on the slope of the swarm survival function determined by b.

Our model goes beyond the approach followed by Rangel et al. (2013) in their inclusive fitness model. First, our model provides statements about (i) the absolute colony size at fissioning and (ii) the absolute size of the leaving swarm that would maximize colony fitness. In addition, it specifies (iii) the swarm interval whereas Rangel et al. only consider the optimal swarm fraction (x) as variable of interest. Implicitly, our model also provides a prediction of the swarm fraction but only as the emerging result of the two absolute values predicted, i.e., \(x=\frac{{\hat{S}}}{\hat{W_0}+{\hat{S}}}\) (see Fig. 4). Further, a reanalysis of Rangel et al.’s model (see https://doi.org/10.1007/s00040-024-00960-9) shows that the impact of intra-colonial relatedness on the optimal swarm fraction x is marginal, at least with the parameters specified by Rangel et al. The optimal swarm fraction x is dominated by the shape of the swarm survival function s(S).

According to our model, swarm size \(S=1\) provides in all scenarios a local maximum. Further, if \(S=1\) is not the global maximum, the smallest swarm size \(S_{min}\) for which \(\rho (W_0, S_{min})>\rho (K/2, 1)\) holds is in all cases substantially larger than 1 (Fig. 5). This suggests that an evolutionary transition from ICF to DCF may not be trivial even if environmental conditions change in a way favoring DCF: it may require passing a large fitness valley if the swarm survival half saturation \(S_c\) is large (see also Planas-Sitjà et al. 2023). Consequently, the starting point to such transition presumably was not selection on the reproductive strategy itself (cf. Ribbands 1953; Brian 1965). Instead, it may have been initiated by the necessity to move the entire colony due to, e.g., parasite infection of old nests or gradual resource depletion in a nest’s surrounding. Migrations of whole colonies driven by seasonal changes in flower abundance are typical for tropical honey bees (Hepburn and Radloff 2011), and nest relocation occurs frequently in ants (Gordon 1992; Pratt et al. 2002; Gibb and Hochuli 2003; Heller and Gordon 2006). Since swarming brings about a temporary release from parasites in honey bees (Seeley 2017; Kohl et al. 2023), parasite pressure might have been another factor favoring colony relocation behavior and might now select for colonies that emit swarms at a faster rate and of smaller size than would be predicted based on the considerations of our model alone. In this case, it is also noteworthy that in honey bees, the maximum survival probability of the old colony (with a new queen) is lower than that of the outgoing swarm provided that the latter is large enough (Rangel et al. 2013); this may be the most trivial reason why the old mother-queen joins the swarm and does not stay in the old nest. However, Seeley (1978, 2017) reports higher survival of established colonies but swarm survival included also data on secondary swarms that typically have a low survivorship. Alternatively, large colonies may gain foraging efficiency (Holway and Case 2000; Stroeymeyt et al. 2017; Burns et al. 2021) or distribute risks of parasitism or predation (Debout et al. 2007; Le Breton et al. 2007; Robinson 2014) by distributing the colony over several inter-connected nests (polydomy); polydomy may also be favored if colony growth is limited by the capacity of nest sites, so that growing colonies need to spread out over several sites (Robinson 2014). “True” colony fissioning, i.e., complete independence of separated nests, may have happened rather accidentally in any of these situations as colony fragments that became isolated from others may have found ways (e.g., due to lost control by the queen) for substituting lost queens (Cronin et al. 2013).

We also want to point out that in some ants like Myrmecina graminicola (lab-raised), even a very small number of workers may already provide substantial benefits for a queen founding a new nest (Finand et al. 2024a). Field studies also suggest that fissioning colonies may be quite small and variable in size (Chéron et al. 2011; Cronin et al. 2012). This implies that \(S_c\) can take small values, so that the fitness valley mentioned above may not be so wide in such taxonomic groups.

One factor that may have promoted budding is stable spatial heterogeneity, with suitable habitat being scattered and rare. Such circumstances may promote the evolution of colony fissioning because they implicitly increase the cost of long-distance dispersal (reducing the survival chances of single queens). This may be one underlying reason for the shift of urban populations of the ant Tapinoma sessile from the ICF to the DCF strategy (Blumenfeld et al. 2022). However, spatio-temporal heterogeneity may promote selection for ICF, as long-distance dispersal allows the colonization of empty habitats and the spreading of risk (cf. Planas-Sitjà et al. 2023). Another factor promoting DCF may be the acceleration of generation time in unstable habitats, as young (split-off) colonies with new (daughter) queens may need less time before becoming reproductive themselves (Tsuji and Tsuji 1996; Blumenfeld et al. 2022). However, this benefit must be balanced against the prolongation of the swarm interval that goes along with releasing larger swarms. Our model does account for this effect (Eq. 8). We do not consider, however, the size-dependent time it takes a recently budded new colony (swarm) to itself reach the size for a first successful fission event.

Our model is quite simple and thus general, but it is certainly also limited in its realism. In particular, our model assumes a constant and aseasonal environment, whereas many social insects live in more or less seasonal environments. Seasonal changes in the environment could affect colony dynamics in several ways, e.g., by seasonally modulating the growth parameters \(r_0\) or K. However, of particular interest in the context of reproductive strategy would be seasonal changes in the swarm survival function s(S) (possibly also in the survival of the old colony) that may constrain the time window in which fissioning is a sensible reproductive option (Seeley and Visscher 1985). In honey bees, the time window in which resources are sufficient to raise brood seems to be limited to the early summer (Simpson 1959), the period in which most swarms are released (e.g., Seeley and Visscher 1985; Henneken et al. 2012). Such seasonal effects may, for example, force bee colonies to release after-swarms that are typically smaller than first swarms (Winston 1980), because swarms released too late, as well as the mother colony, may come under time pressure to prepare for the winter season. Further, some ant species split off several new colonies at about the same time (cf. Cronin et al. 2016b); in some cases, the separating colonies vary considerably in size (Chéron et al. 2011). Such observations cannot be explained by our simple model because, according to it, colonies should release a new colony as soon as the colony size is large enough. Reasons for delaying such opportunities might also be traced to seasonal effects, i.e., if new colonies only have reasonable survival chances within certain seasonal time windows, whereas releasing new colonies of variable size may signal bet-hedging strategies or intra-colonial queen-queen conflicts (Chéron et al. 2011). In honey bees, matters may be further complicated by the interactive consequences of drone production and swarming, a topic investigated by Lemanski and Fefferman (2017) in their model on the optimal timing of reproduction in honey bees. It would be possible to extend our model to account for seasonal effects, but certainly at the cost of generality. We therefore suggest that such extensions should be implemented only when the model is applied to a concrete case study.

We take a colony perspective on fitness and consequently ignore any within-colony conflicts over reproductive allocation or inclusive fitness effects, unlike the model approaches of Pamilo (1991), Visscher (1993), Crozier and Pamilo (1996), and Rangel et al. (2013). However, such effects should not change the principal conclusions we draw here but only have quantitative effects on predictions, as we also show in the Supplement. The model also ignores possible effects of kin-competition, but generally, kin-competition should shift the criterion for selecting a reproductive strategy in favor of (long-distance) ICF (see, e.g., Hamilton and May 1977; Hovestadt et al. 2001; Poethke et al. 2007).

According to our model, fitness expectations of both ICF and DCF are very sensitive to the parameters of the swarm survival function. Approximate fitness identity of the two strategies would thus require a very particular coincidence of model parameters. Consequently, coexistence of ICF and DCF strategies appears very unlikely, and it is not known from bees and wasps (Cronin et al. 2013). However, in many ants that show DCF, mixed strategies have been observed, i.e., individual colonies reproduce both by fissioning and by releasing single queens (e.g., Cronin et al. 2013, 2020). Evolution of both strategies at the population level is predicted by models that account for the low dispersiveness of split-off colonies and the effects of local disturbances and habitat heterogeneity due to an existing competition–colonization trade-off (Cronin et al. 2016a; Planas-Sitjà et al. 2023; Finand et al. 2024b). Mixing strategies at the individual level, i.e., at the level of single colonies, may become especially rewarding if the localized survivorship function (or the population growth function) for swarms itself depends on the number or frequency of swarms released. In particular in species where budding colonies can only move on foot, e.g., all ants and termites, this may easily happen due to local saturation of habitats, so that swarms released in rapid sequence may not find adequate nest sites or sufficient resources (newly founded colonies are likely competitively inferior to established ones). Combining DCF and ICF may thus be a way of creating a “mixed dispersal kernel” (Hovestadt et al. 2001; Rogers et al. 2019) that, on the one hand, allows focusing reproductive investment on nearby suitable (and competitively contested) habitat (via DCF) but, on the other hand, provides the chance to colonize new habitats, spread risks, and avoid competition with kin via (long-distance) ICF. Evolution of mixed strategies was not observed by Planas-Sitjà et al. (2023), but the design of their model did not allow mixed strategies to evolve.

The above considerations should not fundamentally change the findings of this study, which focuses on optimizing the size of split-off colonies given that a size-dependence of swarm survival exists. However, the papers and effects mentioned above suggest that splitting investment between DCF and ICF may be a fitness-enhancing strategy and that the transitions between ICF and DCF may generally be shifted in favor of ICF compared to the model’s predictions. Presumably, it is no accident that a pure DCF strategy only occurs in groups where swarms are capable of flying and the mentioned disadvantages of DCF do not apply, or apply much less, whereas most ants follow either a pure ICF strategy or combine DCF and ICF (Cronin et al. 2013). It may be worthwhile to develop a spatial version of our model, e.g., by combining our approach with that of Planas-Sitjà et al. (2023), to allow for the evolution of such strategy mixing and to include a feedback of the rate at which swarms are released on, e.g., the survival half saturation \(S_c\). It may also be a question of empirical interest whether colonies of social insects, in particular ants, flexibly adjust the size (and frequency) of budding colonies and the relative investment into ICF to the actual ecological conditions, e.g., the availability of nest sites or food resources in their surroundings, as well as to the time of season.

In summary, we here provide a conceptual model of temporal colony dynamics that may shed additional light on the ecological conditions that favor either ICF or DCF as a reproductive strategy and that makes predictions about the optimal size of budding colonies. Due to its simplicity, the model should be empirically testable, in particular by contrasting the foundation success of swarms of different sizes, including the success of single queens (ICF).

Fig. 5. Minimum swarm size \(S_{min}\) that would result in swarming rates larger than those generated by ICF (\(S=1\)). The red line separates the parameter range where ICF (\({\hat{S}}=1\)) is the global maximum from that where DCF is the better strategy. Note that \(S_{min}\) jumps from \(S_{min}=1\) to \(S_{min}\gg 1\) at this transition line, creating a wide fitness valley.