## Abstract

This article elaborates on McShea and Brandon’s idea that drift is unlike the rest of the evolutionary factors because it is constitutive rather than imposed on the evolutionary process. I show that the way they spelled out this idea renders it inadequate and is the reason why it received some (good) objections. I propose a different way in which their point could be understood, that rests on two general distinctions. The first is a distinction between the underlying mathematical apparatus used to formulate a theory and a concept proposed by that theory. With the aid of a formal reconstruction of a population genetic model, I show that drift belongs to the first category. That is, that drift is constitutive of population genetics in the same sense that multiplication is constitutive in classical mechanics, or that circle is constitutive in Ptolemaic astronomy. The second distinction is between eliminating a concept from a theory and setting its value to zero. I will show that even though drift can be set to zero just like the rest of the evolutionary factors (as others have noted in their criticism of McShea and Brandon), eliminating drift is much harder than eliminating those other factors, since it would require changing the entire mathematical apparatus of standard population genetic theory. I conclude by drawing some other implications from the proposed formal reconstruction.

## Notes

There is quite a large literature discussing the adequacy of the analogy in various different aspects (see Sect. 5 for some examples). In this article I focus on only one of those aspects, which is presented in what follows.

For example, as I show below, because considering it allows us to get a better understanding of how certain models within PG are put together, and how they explain.

For more on their definition of constitutive and imposed constraints, see Brandon and McShea (2012).

Again, this is for clarity. The axioms could be established in an entirely formal manner.

Alternative presentations, for example, calculate the probability of every possible type of mating in the population, together with the sampling probabilities for the descendants of each of those types. Generational transitions are then modeled with recourse to more than two sampling processes. There are other formal reconstructions that focus on this kind of presentation; see for example Lloyd (1994) and Lorenzano (2014).

Following usual talk, I call the type of a gene an allele-type (for example, when one says that “gene \(g_1\) is of allele-type \(A\), while gene \(g_2\) is of allele-type \(a\)”), and the type of a pair of genes (of an individual) a genotype, even though "genotype" would have been a better-fitting name for the type of a gene. The term "genotype" is also used ambiguously in the literature to refer to a particular pair of genes (not to its type); here, I use the term "genotype" exclusively for the type of a pair of genes, not for the particular pair itself.

I also assume, for simplicity, that fitness coefficients remain constant over the generations, though this could be easily modified in a more complex version of the reconstruction.

In a sampling process without replacement, individuals that have already been chosen from a population to form part of the sample cannot be chosen again. In the typical examples, where one samples marbles from an urn, marbles that have already been sampled are not put back inside the urn. In contrast, in a sampling process *with replacement*, marbles are put back into the urn after being sampled and can be chosen again. Obviously, when one speaks of biological sampling processes there is no one *choosing* the individuals for the samples; for instance, in parental sampling, the sample consists simply of the individuals who survived.

Probability assignments for sampling processes with and without replacement are different; however, it can be mathematically proven that, when the samples are large, the probabilities of the second approximate (and, in the limit, equal) those of the first (see Feller 1971, Chapters II, VI).
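The convergence between the two kinds of sampling can be checked numerically. The following is a minimal sketch, not part of the original reconstruction: it assumes a two-kind urn, a fixed sample size of 10, and reads the limit as one in which the population grows large relative to the sample. The function names and the numbers used are illustrative assumptions.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(k marked items) when n items are drawn WITHOUT replacement
    from a population of N items, K of which are marked."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """P(k marked items) when n items are drawn WITH replacement
    and each draw is marked with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def max_gap(N, n=10, p=0.3):
    """Largest pointwise difference between the two assignments
    for a population of size N (hypothetical parameters n and p)."""
    K = round(N * p)  # number of marked items in the population
    return max(
        abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, K / N))
        for k in range(n + 1)
    )

# The gap between the two probability assignments for growing populations.
gaps = [max_gap(N) for N in (20, 200, 2000, 20000)]
for N, gap in zip((20, 200, 2000, 20000), gaps):
    print(f"N = {N:>6}: max |without - with| = {gap:.6f}")
```

Under these assumptions the printed gap shrinks as the population grows, which is the sense in which the two probability assignments coincide in the limit.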

Notice that, for this second process, it is not the case that \(I_{i+1} \subseteq I_i^{*}\) (the "sample" is not a subset of the original population), because that would violate Axiom 1. In fact (because of Axiom 1) \(I_{i+1}\) and \(I_i^{*}\) do not share any elements. However, the genotype distribution of \(I_{i+1}\) depends on that found in \(I_i^{*}\) in exactly the same way as if it were a sample (a subset) properly speaking; see the constraints listed below.

Fitness coefficients do not play a role here because, as said before, I am only considering selection by viability. In a more complete version, fitnesses should appear.

I assume that migration plays a role especially in the first sampling process, and mutation in the second.

Logicians (and to a lesser extent, mathematicians) usually do specify the language in which their theories are built, while empirical scientists typically do not do this. However, formal reconstructions of empirical theories, done usually by philosophers, do make the language explicit. For example, for a formal reconstruction of CM that makes explicit all the terms used in its language, see Balzer et al. (1987).

Of course, there is also the issue of whether the theory itself remains the same if one of its concepts is eliminated. I will not go into this problem here.

Moreover, the concept of drift was originally introduced into PG to account for certain empirical phenomena that could not be explained purely by selection (e.g. Gulick 1872, discusses the puzzling geographical distribution of the genera within a family of land snails in the Hawaiian islands, which are phenotypically very distinct, but live in environments that are very similar to one another; see also Hagedoorn and Hagedoorn 1921; Brooks 1899, for other antecedents). None of this would make much sense if drift was just a purely mathematical phenomenon.

It might be thought that my conclusions regarding drift as part of the background mathematical vocabulary lend some credibility to the statisticalist position, defended chiefly by Walsh, Ariew, Lewens and Matthen (I thank an anonymous reviewer for this suggestion). I have some reservations about this, but for reasons of space I cannot explore this issue here, since going into it would require introducing that debate more fully. Instead, I leave it open as a suggestion.

## References

Balzer, W., Moulines, C. U., & Sneed, J. D. (1987). *An architectonic for science: The structuralist program*. Dordrecht: Reidel.

Baravalle, L., & Vecchi, D. (forthcoming). Drift as a force of evolution: A manipulationist account. In *Life and evolution*. Berlin: Springer.

Beatty, J. (1984). Chance and natural selection. *Philosophy of Science, 51*(2), 183–211.

Brandon, R. (2005). The difference between selection and drift: A reply to Millstein. *Biology and Philosophy, 20*(1), 153–170.

Brandon, R. (2006). The principle of drift: Biology’s first law. *The Journal of Philosophy, 103*(7), 319–335.

Brandon, R. N., & McShea, D. W. (2012). Four solutions for four puzzles. *Biology and Philosophy, 27*(5), 737–744.

Brooks, W. K. (1899). *The foundations of zoology*. Oxford: Macmillan Co.

Carnap, R. (1950). *Logical foundations of probability*. Chicago: University of Chicago Press.

Crow, J. F., & Kimura, M. (1970). *An introduction to population genetics theory*. New York: Burgess Pub. Co.

Earnshaw, E. (2015). Evolutionary forces and the Hardy–Weinberg equilibrium. *Biology and Philosophy, 30*(3), 423–437.

Feller, W. (1971). *An introduction to probability theory and its applications*. New York: Wiley.

Gillespie, J. H. (2004). *Population genetics: A concise guide*. New York: JHU Press.

Gulick, J. T. (1872). On the variation of species as related to their geographical distribution, illustrated by the achatinellinæ. *Nature, 6*, 222–224.

Hagedoorn, A. L., & Hagedoorn, A. C. (1921). *The relative value of the processes causing evolution*. The Hague: Nijhoff.

Hartl, D. L., & Clark, A. G. (2007). *Principles of population genetics*. New York: Sinauer Associates.

Hitchcock, C., & Velasco, J. D. (2014). Evolutionary and Newtonian forces. *Ergo, an Open Access Journal of Philosophy, 1*(2), 39–77.

Lewens, T. (2010). The natures of selection. *The British Journal for the Philosophy of Science, 61*(2), 313–333.

Lloyd, E. A. (1994). *The structure and confirmation of evolutionary theory*. Princeton: Princeton University Press.

Lorenzano, P. (2014). What is the status of the Hardy–Weinberg law within population genetics? In M. Galavotti, E. Nemeth, & F. Stadler (Eds.), *European philosophy of science – Philosophy of science in Europe and the Viennese heritage* (Vol. 17, pp. 159–172). Berlin: Springer.

Luque, V. J. (2016a). The principle of stasis: Why drift is not a zero-cause law. *Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 57*, 71–79.

Luque, V. J. (2016b). Drift and evolutionary forces. *THEORIA. An International Journal for Theory, History and Foundations of Science, 31*(3), 397.

Matthen, M., & Ariew, A. (2002). Two ways of thinking about fitness and natural selection. *Journal of Philosophy, 99*(2), 55–83.

McShea, D., & Brandon, R. (2010). *Biology’s first law*. Chicago: The University of Chicago Press.

McShea, D. W., Wang, S. C., & Brandon, R. N. (2019). A quantitative formulation of biology’s first law. *Evolution, 73*(6), 1101–1115.

Millstein, R. L. (2002). Are random drift and natural selection conceptually distinct? *Biology and Philosophy, 17*(1), 33–53.

Millstein, R. L., Skipper, R. A., & Dietrich, M. R. (2009). (Mis)Interpreting mathematical models: Drift as a physical process. *Philosophy, Theory, and Practice in Biology, 31*(4), 459–482.

Otsuka, J. (2016). A critical review of the statisticalist debate. *Biology and Philosophy, 31*(4), 459–482.

Pence, C. H. (2017). Is genetic drift a force? *Synthese, 193*(6), 1967–1988.

Reisman, K., & Forber, P. (2005). Manipulation and the causes of evolution. *Philosophy of Science, 72*(5), 1113–1123.

Roffé, A. J. (2017). Genetic drift as a directional factor: Biasing effects and a priori predictions. *Biology and Philosophy, 32*(4), 535–558.

Shapiro, L. A., & Sober, E. (2007). Epiphenomenalism – The Do’s and the Don’ts. In G. Wolters & P. K. Machamer (Eds.), *Thinking about causes: From Greek philosophy to modern physics* (pp. 235–264). Pittsburgh: University of Pittsburgh Press.

Sober, E. (1984). *The nature of selection: Evolutionary theory in philosophical focus*. Chicago: University of Chicago Press.

Stephens, C. (2004). Selection, drift, and the “forces” of evolution. *Philosophy of Science, 71*, 550–570.

Stephens, C. (2010). Forces and causes in evolutionary theory. *Philosophy of Science, 77*(5), 716–727.

Wakeley, J. (2005). The limits of theoretical population genetics. *Genetics, 169*(1), 1–7.

Walsh, D. M., Ariew, A., & Lewens, T. (2002). The trials of life: Natural selection and random drift. *Philosophy of Science, 69*(3), 452–473.

Williams, M. B. (1970). Deducing the consequences of evolution: A mathematical model. *Journal of Theoretical Biology, 29*(3), 343–385.

## Acknowledgements

This work has been funded by the research projects SAI 827-223/19 and PUNQ 1401/15 (National University of Quilmes, Argentina), UNTREF 32/15 255 (Universidad Tres de Febrero, Argentina) and UBACyT 20020170200106BA (Universidad de Buenos Aires, Argentina).

## Additional information

### Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Appendix: Proofs of theorems

### Theorem A1

Let \(R\) be the relation \(\left\{ \left\langle x, \{x,y\} \right\rangle / x \in G_i \,\&\, \{x,y\} \in I_i \right\}\). Note that \(R\) is a function, since Axiom 3 specifies that \(R\) satisfies the uniqueness and existence requirements for functions (each gene has one and only one corresponding individual to which it belongs). Thus, \(R\) can be seen as a \(G_i \to I_i\) function. Additionally, Axiom 2 implies that each gene is present only once inside each individual (in the \(g_j \neq g_k\) part). All of this tells us that, for each gene, there exists one and only one individual to which it belongs, and which also contains a different gene. In that way, every individual from generation \(i\) will be the value assigned by function \(R\) to two different genes (arguments) from that generation. Therefore, if \(|G_i|\) is the number of genes in generation \(i\), function \(R\) partitions its domain \(G_i\) into \(|G_i|/2\) cells, each made up of the two genes mapped to the same individual. Notice also that \(R\) is surjective (every element of the codomain, in this case \(I_i\), is the value of some argument), since, by Axiom 3, no gene can be “loose”. Therefore, the number of cells also coincides with the number of individuals. Therefore, \(|I_i| = |G_i|/2\), which immediately gives us the desired result.
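The counting argument can be illustrated on a toy generation. The sketch below is only an illustration under assumptions of my own: genes are represented as integers, individuals as two-element frozensets of genes, and the construction is built so that each gene sits in exactly one individual (the role the proof assigns to Axioms 2 and 3). None of the function names come from the reconstruction itself.

```python
from itertools import count

def build_generation(n_individuals):
    """Return (genes, individuals) for a toy diploid generation.
    Each individual is a set of two distinct, freshly numbered genes."""
    gene_ids = count()
    individuals = [
        frozenset((next(gene_ids), next(gene_ids)))
        for _ in range(n_individuals)
    ]
    genes = set().union(*individuals)
    return genes, individuals

def relation_R(genes, individuals):
    """Pair each gene with every individual that contains it.
    A list is kept per gene so that uniqueness can be tested, not assumed."""
    return {g: [ind for ind in individuals if g in ind] for g in genes}

genes, individuals = build_generation(5)
R = relation_R(genes, individuals)

# R is a function: every gene has exactly one owner.
is_function = all(len(owners) == 1 for owners in R.values())
# R is surjective: every individual is the value of some gene.
is_surjective = {owners[0] for owners in R.values()} == set(individuals)
# Every individual is the value of exactly two different genes.
two_genes_each = all(
    sum(1 for owners in R.values() if owners[0] == ind) == 2
    for ind in individuals
)

print(is_function, is_surjective, two_genes_each)
print(len(genes), 2 * len(individuals))
```

When the three checks hold, the number of genes is twice the number of individuals, which is the relation between \(|G_i|\) and \(|I_i|\) that the proof derives.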

### Theorem A2

For simplicity’s sake, I only prove this for finite populations, and for allele-type \(A\) (the proof for \(a\) is almost identical). By applying the definitions of \(FreqAT\) and \(FreqGT\), what needs to be proved is that:

It shall be convenient to call \(G_i(x)\) the set \(\left\{ g_k \in G_i / f_1(g_k) = x \right\}\) (i.e. the set of genes of type \(x\) present in generation \(i\)), and \(I_i(\{x,y\})\) the set \(\left\{ i_k \in I_i / f_2(i_k) = \{x,y\} \right\}\) (the set of individuals of genotype \(\{x,y\}\) in generation \(i\)). With this terminology, what needs to be proved is that:

Theorem A1 establishes that \(|G_i| = 2 \times |I_i|\). Thus, for the previous equality to hold, what needs to happen is that:

To prove this, I define two new sets, called \(G_i(A,\{A,A\})\) and \(G_i(A,\{A,a\})\). These sets will represent the set of \(A\) genes present in an \(\{A,A\}\) kind of individual, and in an \(\{A,a\}\) kind of individual. Formally,

Axioms 3 and 4 imply that \(G_i(A,\{A,A\}) \cap G_i(A,\{A,a\}) = \varnothing\) and that \(G_i(A,\{A,A\}) \cup G_i(A,\{A,a\}) = G_i(A)\) (the first follows from the fact that no gene is inside more than one individual, while the second follows from the fact that every gene is inside at least one individual, along with another gene, which must be either of the same or of a different type). Both of these facts imply that \(\left| G_i(A,\{A,A\}) \right| + \left| G_i(A,\{A,a\}) \right| = \left| G_i(A) \right|\).

The following two facts, along with the equation just derived, directly give us the desired result.

1. \(2\left| I_i(\{A,A\}) \right| = \left| G_i(A,\{A,A\}) \right|\)

   This follows from the fact that the members of \(I_i(\{A,A\})\) are pairs of different members of \(G_i(A,\{A,A\})\), since every member of \(G_i(A,\{A,A\})\) belongs to an \(AA\) individual, along with another \(G_i(A,\{A,A\})\) member.

2. \(\left| I_i(\{A,a\}) \right| = \left| G_i(A,\{A,a\}) \right|\)

   This follows from the fact that every member of \(G_i(A,\{A,a\})\) belongs to a (different) \(Aa\) individual (i.e. a member of \(I_i(\{A,a\})\)). This individual also contains a member not belonging to \(G_i(A,\{A,a\})\), but to \(G_i(a,\{A,a\})\). Thus, a bijection can be established between both sets, which means that they have the same number of elements.
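The way the pieces of this proof fit together can be written out as a single chain. This is a sketch rather than a restoration of the displayed equations: it assumes that \(FreqAT(G_i, x)\) is defined as \(|G_i(x)|/|G_i|\) and \(FreqGT(I_i,\{x,y\})\) as \(|I_i(\{x,y\})|/|I_i|\), which is the natural reading of the set notation used in the proof but is not stated in this appendix. The `align*` environment assumes the `amsmath` package.

```latex
\begin{align*}
|G_i(A)| &= |G_i(A,\{A,A\})| + |G_i(A,\{A,a\})|
  && \text{(disjoint sets whose union is } G_i(A)\text{)} \\
         &= 2\,|I_i(\{A,A\})| + |I_i(\{A,a\})|
  && \text{(facts 1 and 2)} \\[4pt]
\frac{|G_i(A)|}{|G_i|}
         &= \frac{2\,|I_i(\{A,A\})| + |I_i(\{A,a\})|}{2\,|I_i|}
  && \text{(Theorem A1: } |G_i| = 2 \times |I_i|\text{)} \\
         &= \frac{|I_i(\{A,A\})|}{|I_i|}
            + \frac{1}{2}\,\frac{|I_i(\{A,a\})|}{|I_i|}
\end{align*}
```

Read with the assumed definitions, the last line says that the frequency of allele-type \(A\) equals the frequency of \(\{A,A\}\) individuals plus half the frequency of \(\{A,a\}\) individuals.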

### Theorem A3

This theorem follows from the following two facts. First, that if:

is a multinomial distribution (with \(p\), \(q\), \(r\) being the frequencies of three kinds of objects in the population), then the expected value of the frequencies in the sample is exactly \(p : q : r\) (i.e. all kinds of objects maintain their original frequency). Note that the probability assignment for the first sampling process is multinomially distributed.

The second fact is the probabilistic “Law of large numbers”. This law states (with terminology modified to fit the one I am using) that:

where \(n\) is the size of a sample in a probabilistic sampling process \(Sampl(x)\), \(\hat{p}\) is the frequency of one kind of object in the sample, \(E_{\hat{p}}\left( Sampl(x) \right)\) is the expected value of \(\hat{p}\) in \(Sampl(x)\), and \(\epsilon\) is an arbitrary number (thus, an arbitrarily small one). In other words, at infinite sample sizes, the probability that deviations from the expected value occur (for the frequency of one kind of object) reduces to zero (for more on this law, see Feller 1971, chapter X). If this is applied to the first sampling process, the result is that actual outcomes should equal their expected outcomes. These two facts together directly imply the desired result.
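The two facts used in this proof can be illustrated with a small simulation. The sketch below is an illustration only, under assumptions of my own: three kinds of objects with hypothetical population frequencies 0.25, 0.5 and 0.25, sampling with replacement (so that counts are multinomially distributed), and a fixed random seed. It is not the sampling process of the reconstruction, which also involves fitness coefficients.

```python
import random

def sample_frequencies(weights, n, rng):
    """Draw n objects with replacement, each kind chosen with probability
    given by `weights`; return the observed frequency of each kind."""
    kinds = range(len(weights))
    draws = rng.choices(kinds, weights=weights, k=n)
    return [draws.count(kind) / n for kind in kinds]

def max_deviation(weights, n, rng):
    """Largest gap between an observed frequency and its expected value.
    For multinomial sampling the expected frequency is the weight itself."""
    observed = sample_frequencies(weights, n, rng)
    return max(abs(o - w) for o, w in zip(observed, weights))

rng = random.Random(0)            # fixed seed, for reproducibility only
population = [0.25, 0.5, 0.25]    # hypothetical p, q, r

deviations = {n: max_deviation(population, n, rng) for n in (10, 1_000, 100_000)}
for n, dev in deviations.items():
    print(f"n = {n:>7}: largest deviation from expectation = {dev:.4f}")
```

Larger samples tend to show smaller deviations from the expected frequencies; the theorem concerns the idealized case in which the sample is infinite and the deviation vanishes.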

### Theorem A4

The proof of this result uses the same two facts as the proof from above. Since \(|I_i^{*}|\) is infinite, by Theorem A3, we get that for every genotype \(\{x,y\}\):

Now, since we assume that all fitnesses are equal, then there exists a number \(m\), such that \(m = w_{A,A} = w_{A,a} = w_{a,a}\). Thus (by definition of \(\bar{w}(i)\)):

Since the \(m\)’s cancel out in the numerator and denominator, and what is left in the denominator (the sum of all the genotype frequencies) equals 1, all of this equals \(FreqGT(I_i, \{x,y\})\). Thus, after the first sampling process, the frequencies of the genotypes remain identical. Additionally, Theorem A2 implies that allele-type frequencies will also be identical to the originals (since they can be calculated from genotype frequencies, which are themselves identical to the originals). Thus, for every allele-type \(x\), \(FreqAT(\cup I_i^{*}, x) = FreqAT(G_i, x)\) (the union set of a set of individuals equals the set of genes present in those individuals).

By using the same two facts as in the demonstration of Theorem A3, and the probability distribution of the second sampling process, we get the desired result.
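The cancellation step in this proof can be spelled out as follows. This is a sketch under an assumption that the appendix text does not state explicitly: that the expected genotype frequency after the first sampling process is the original frequency weighted by \(w_{x,y}/\bar{w}(i)\), with \(\bar{w}(i)\) the frequency-weighted mean of the fitness coefficients (the usual form of viability selection). The `align*` environment assumes the `amsmath` package.

```latex
\begin{align*}
\bar{w}(i)
  &= \sum_{\{u,v\}} FreqGT(I_i,\{u,v\})\, w_{u,v}
   = m \sum_{\{u,v\}} FreqGT(I_i,\{u,v\})
   = m \cdot 1 \\[4pt]
FreqGT(I_i^{*},\{x,y\})
  &= \frac{FreqGT(I_i,\{x,y\})\, w_{x,y}}{\bar{w}(i)}
   = \frac{FreqGT(I_i,\{x,y\})\, m}{m}
   = FreqGT(I_i,\{x,y\})
\end{align*}
```

On this reading, the sum over genotypes runs over \(\{A,A\}\), \(\{A,a\}\) and \(\{a,a\}\), and the result holds for any common fitness value \(m \neq 0\), since only the ratio \(w_{x,y}/\bar{w}(i)\) matters.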

## About this article

### Cite this article

Roffé, A.J. Drift as constitutive: conclusions from a formal reconstruction of population genetics.
*HPLS* **41**, 55 (2019). https://doi.org/10.1007/s40656-019-0294-6

### Keywords

- Population genetics
- Drift
- Formal reconstruction
- Constitutivity
- Classical mechanics
- Analogy