# A flexible theoretical representation for the temporal dynamics of structured populations as paths on polytope complexes

## Abstract

We present a new theoretical framework to represent the dynamics of structured populations through time and across geographic space. We show (i) that the mechanisms by which populations evolve lead to combinatorial structures, and (ii) that measures of gene flow and geographical structure lead to linear systems. These characteristics determine two polytope complexes that encode all feasible migration scenarios. Analysis of these polytope complexes demonstrates how systems of structured populations can be classified consistently, and how population histories can be represented as paths on a concrete mathematical space, which in turn promises to simplify the search space required for reconstructing past migration processes from population genetic data.

## Keywords

Migration, Population structure, Population genetics, Polytope complex, Circle patterns

## Mathematics Subject Classification

92D25, 52B99, 52C26

## 1 Introduction

Many demographic factors, including movements of both individuals and entire populations, can alter the nature of linkages between populations over time. Representing the diverse forms of these temporal dynamics requires a flexible theoretical framework that is consistent with longer-term goals of reconstructing the history of structured populations (“metapopulations”) from population genetic data. We show that the key characteristics of these migration dynamics—the intensity of migration, pathways of migration and the spatial distribution of populations—give rise to various mathematical structures that allow us to view populations and migration between them as geometric objects, and their dynamics as pathways on a polytope complex \(\mathfrak {K}\). Here, we construct such a framework and propose it as a powerful mathematical foundation for studying complex metapopulation dynamics.

To illustrate the basic ideas that motivate our work, consider the ‘Out of Africa’ model, in which anatomically modern humans originated in a small region of Africa, leaving \(\sim \)50,000 years ago to begin the settlement of Eurasia. On reaching the Middle East, the population divided, one group moving towards Europe, while the other continued on to Asia. Subsequent splitting (and joining) events produced the distribution of human populations observed today.

A model of this type can be stated more formally. If we denote the number of populations at time \(t\) by \(N(t)\), then initially \(N(t_0)=1\). At the time of the first split \(t_2\), \(N(t_2)=2\); at \(t_3, N(t_3)=3\); and so forth. Note that these three populations are necessarily all located on the circumference of a hypothetical sphere—here, an individual can only migrate to its immediate neighbors, although the probability of migrating to any given neighbor might be large or small. As the number of populations increases to four, the possible geometric patterns produced are no longer unique. As shown by geometric-combinatorial arguments, the relative locations of the populations can instead adopt a range of pattern settings (represented in Fig. 1).

Considering the theoretical mathematical structures that arise as part of this theory is beneficial for a number of reasons: (i) it allows the relationships between alternative migration scenarios to be quantified; (ii) it can be used to test the results of current models and software that describe migration among populations; and (iii) it could ultimately be employed in a statistical inference setting to reconstruct likely migration histories based on genetic information. The theory allows the number of populations to change through time, and it facilitates comparison between different metapopulation systems. However, we will begin by considering the simplest state: a fixed number of populations, each of which is represented by a point, with migration indicated by edges. Weights assigned to these edges represent a measure of both population and individual migration. The graph determined by these points and edges is also defined to capture spatial factors, such as the physical distance between populations and the possible presence of any intervening geographical barriers, such as mountain ranges or water crossings.

The fact that populations interact on a planet, whose surface is a two-dimensional sphere, might be obvious, but is worth emphasizing because it implies certain constraints on population structure. For instance, when we consider migration among a fixed number of populations, the interacting populations can only form a limited number of graph structures determined by a sphere with \(n\) marked points. In addition, if a group of populations has limited mobility over geographical space (i.e., movement is not arbitrarily free), a simple but important observation is that every migration path from a given location \(L\) to a chosen destination \(D\) must cross the boundary of the region containing all points that are closer to \(L\) than to any other population. In other words, the points \(L\) and \(D\) are the centers of two cells of a Voronoi tessellation. The following definition introduces the Voronoi cell, which forms a fundamental basis of our migration theory.

### **Definition 1**

Given a set of points \(P\) on a surface \(S\) with metric \(d\), the Voronoi cell associated with \(i \in P\) is the set of all points \(j\) in \(S\) that satisfy \(d(i,j) \le d(i',j)\) for all \(i' \in P\). The Voronoi diagram \(G\) associated with \(P\) is the set of all Voronoi cell boundaries. Further, we will say that \(P\) and \(d\) determine a Voronoi cell decomposition of \(S\).
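Definition 1 can be made concrete with a minimal sketch. It assumes that points on the sphere are stored as unit vectors and that \(d\) is the great-circle metric; the function names and the three example sites are hypothetical and serve only as illustration.

```python
import math

def great_circle(p, q):
    """Central angle (radians) between two unit vectors on the sphere."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error

def voronoi_site(j, sites, d=great_circle):
    """Index of the site i in P whose Voronoi cell contains the point j.

    Points on a cell boundary are tied; the lowest index wins here.
    """
    return min(range(len(sites)), key=lambda i: d(sites[i], j))

# Hypothetical sites: north pole, south pole and one equatorial point.
P = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 0.0, 0.0)]

cell_of_northern_point = voronoi_site((0.0, 0.6, 0.8), P)   # nearest to the north pole
cell_of_southern_point = voronoi_site((0.8, 0.0, -0.6), P)  # nearest to the equatorial site
```

Any other metric can be passed through the argument `d`, which mirrors the role of \(d\) in the definition.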

Observe that even though the locations of a set of points are needed to compute a Voronoi diagram, this is not overly restrictive. First, populations are usually restricted geographically to well defined regions, with these regions typically well separated. For most biological species, it is sensible to define some form of population grouping, to which a geographical center can then be assigned. Second, our model only uses the center of a population as a way to define boundaries between regions. The individuals themselves could be dispersed within this region. Our model holds as long as the population has some form of geographical center (regardless of whether individuals actually live at that point).

In the case of a geographically static set of populations (i.e., the populations themselves do not move), each population belongs to a Voronoi cell whose shape is invariant through time. However, even in the special (and biologically unrealistic) case of geographical stasis, the chosen distance measure (“migration”) linking neighboring cells may potentially change through the movement of individuals between populations. In another special case, populations can change their geographical location, while maintaining the same level of migration between them (e.g., as is the case for seasonal nomadic populations). In more general settings, population movements and individual migration are interrelated, and the Voronoi cell tessellation must represent both characteristics jointly. With this general framework in mind, we define the concept of a migration pattern in the following section.

Voronoi diagrams and related theory have been applied to a wide range of scientific problems (see Okabe et al. 2000 for a survey of applications). Their use in biology has an especially long history. For instance, Voronoi diagrams have been used to analyze the geometric structure of biological molecules, to cluster biological data, and to estimate the volume and surface of interphase chromosomes (Lee and Richard 1971; Ban et al. 2004; Kim 2004). However, we are not aware of prior applications to migration theory or analysis, especially using combinatorics.

### 1.1 The space of migration patterns

We define migration patterns as being the same if their corresponding graphs are equivalent under homeomorphism^{1} and their corresponding edges^{2} have the same weights. However, as we specifically consider migration on a sphere, we include a set of numbers \(\varTheta =\{\theta _i\}_{i \in I}\) in our definition, where each \(\theta _i\) can be considered an intersection angle between two circles that belong to a family of circles uniquely associated with the migration pattern graph. \(\varTheta \) captures the geographical dimension of a migration pattern, while allowing migration patterns to be identified uniquely among different metapopulations, which may be widely separated geographically, but possess the same underlying geographical structure. In addition, we also include a set of weights \(W\) in our definition, which represents a measure of gene flow between populations,^{3} and a set \(T\), which measures the total flow crossing each population.

### **Definition 2**

The migration pattern of a set \(P=\{P_1,P_2,\ldots ,P_n\}\) of \(n\) populations is a 4-tuple \(M=(G,\varTheta ,W,T)\), where \(G\) is the Voronoi diagram on the two dimensional sphere \(S^2\) determined by the set of geographical coordinates of \(P\); \(W=\{w_e\}_{e \in E(G)}\) is a set of positive numbers such that \(0 < w_e < 1\) and the sum of weights \(w_e\) on the face of \(G\) corresponding to \(P_i\) is \(T_i\), where \(T=\{T_1,T_2,\ldots ,T_n\}\); and \(\varTheta =\{\theta _e\}_{e \in E(G)}\) is a set of positive numbers.

We call \(T\) the set of total loads of \(M\) (or its vertex weights). For a set of populations in geographical space, it can be shown that a collection of circles exists that contains all populations on their boundaries, with the set of weights \(\varTheta =\{\theta _i\}_{i \in I}\) corresponding to the set of interior angles associated with this collection of circles. More details follow on this point in Sect. 2.2.
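As a minimal sketch of Definition 2, the 4-tuple can be held in a small data structure from which the total loads \(T\) are computed. The class name, field names and numerical values below are hypothetical; the example uses the three-population case, where \(G\) has two vertices and three edges, so each face is bounded by two edges.

```python
from dataclasses import dataclass

@dataclass
class MigrationPattern:
    faces: dict  # population name -> labels of the edges bounding its Voronoi cell
    theta: dict  # edge label -> positive number theta_e
    w: dict      # edge label -> weight w_e with 0 < w_e < 1

    def total_loads(self):
        """T_i: sum of the weights w_e on the face of G corresponding to P_i."""
        return {p: sum(self.w[e] for e in edges) for p, edges in self.faces.items()}

    def is_valid(self):
        """Check the sign and range conditions stated in Definition 2."""
        return (all(0 < x < 1 for x in self.w.values())
                and all(t > 0 for t in self.theta.values()))

# Hypothetical three-population pattern; theta and w are placeholder values.
M = MigrationPattern(
    faces={"P1": ["a", "b"], "P2": ["b", "c"], "P3": ["c", "a"]},
    theta={"a": 1.0, "b": 1.0, "c": 1.0},
    w={"a": 0.1, "b": 0.2, "c": 0.3},
)
T = M.total_loads()
```

The sketch stores the combinatorics of \(G\) only through its faces; the embedding of \(G\) in the sphere is not represented.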

We declare two migration patterns to be identical if their \(W\) and \(\varTheta \) weighted graphs are equivalent. Further, we call the collection of all migration patterns, for a system of \(n\) populations with total weight \(T\), the space of migration patterns \(MS(n,T)\). The migration graph of \(M\) will be called \(G\). We assume that the ends of every edge of a migration graph are different, which implies that a migration graph has at least two vertices (i.e., we do not consider the theoretical, but biologically unrealistic, case of a population that is completely enclosed by another population). Finally, the collection of all migration graphs of \(MS(n,T)\) will be called the space of combinatorial structures of \(MS(n,T)\), denoted by \(CombMS(n)\).^{4}

Note that our model could be modified to use \(\varTheta \) as weights on a directed graph, dual of the Voronoi diagram with centers being the populations under study. We opt not to do so because, in practice, computing the direction of migration is difficult (Hey 2010), while nondirectional measures like \(F_{ST}\) are more amenable to calculation (Cox and Hammer 2010). We prefer to model non-directed graphs to provide flexibility of choice around measures of migration. Nevertheless, implementing a directed graph structure is a natural extension of this research, and as directional measures of migration improve, a directed graph structure will become an increasingly worthwhile pursuit.

### 1.2 Paper outline

In the introduction, we defined the space of migration patterns \(MS(n,T)\) as a natural setting to study the movement of, and individual gene flow between, a fixed number of populations that interact across geographical space. In the following sections, we will explore the rich mathematical nature of \(MS(n,T)\) by studying its combinatorial and graph structure. In the subsection on linear systems and polytopes, we will show that two linear systems of equations/inequalities can be associated with each migration graph, and indicate how our approach leads to two Euclidean polytope complexes that encode all of the information provided by the migration pattern. We will then show how the evolution of a metapopulation system through time can be viewed as a path on a specific polytope complex when the number of populations is fixed, and later extend this finding to more general cases where populations can split and merge. We then provide a real world example showing how this mathematical framework can be applied. The paper concludes with a general discussion of the theory. Finally, we provide an appendix with mathematical proofs required to support all of the new mathematical ideas introduced in this work.

## 2 Methods

### 2.1 The combinatorics of migration patterns

In this subsection, we consider only the space of combinatorial structures \(CombMS(n)\) for a fixed number of populations \(n\). Hence, we assume that the number of cells in the Voronoi cell decomposition of the sphere is also \(n\). The number of edges may vary, but the degree of each vertex (the number of edges incident to it) cannot be less than three.

A *contraction move* is a transformation that, when applied to an edge \(e\) of a migration graph, continuously reduces its length to zero in such a way that all other edges are topologically preserved (Fig. 3). A contraction move on an edge produces a new graph whose number of edges is reduced by one.

An *expansion move* creates a new edge \(e'\) by splitting a vertex \(P\), of valence greater than 3, into two vertices \(P_1\) and \(P_2\) such that \(e'\) connects \(P_1\) and \(P_2\), while all edges formerly incident to \(P\) remain incident to either \(P_1\) or \(P_2\). Observe that the valences of the vertices \(P\), \(P_1\) and \(P_2\) are related by:

\(\mathrm{val}(P_1)+\mathrm{val}(P_2)=\mathrm{val}(P)+2\)

where \(\mathrm{val}\) denotes valence and the additional 2 counts the two ends of the new edge \(e'\).

*Whitehead move*. We illustrate a Whitehead move on edge \(e\) of graph \(G_1\) in Fig. 4. It is convenient to imagine edge \(e\) as being continuously contracted until it collapses to a single point, as represented in the central graph. The process continues by expanding that point into a new vertical edge, also denoted by \(e\), to produce graph \(G_2\). In practice, a Whitehead move changes \(G_1\) to \(G_2\) in a single step. Whitehead moves have been used to describe the combinatorics of moduli spaces (see Amaris 2007; Kasra and Tao 2013). They have a central role in our theory because Whitehead moves connect all cubic migration patterns in \(CombMS(n)\), as described in Proposition 1 (see below). This in turn simplifies the analysis and description of general migration patterns since they can be viewed as being embedded in a cubic migration pattern.
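The effect of a contraction move on the combinatorics of a graph can be sketched as follows. This is an illustration under simplifying assumptions: the graph is stored as an abstract multigraph (a dictionary from edge labels to end vertices), the embedding in the sphere is not tracked, and the function names and the tetrahedral example are hypothetical.

```python
def contract_edge(edges, label):
    """Contract the edge `label` in a multigraph given as {label: (u, v)}.

    The two end vertices are merged (v is renamed to u) and the edge itself
    disappears; every other edge keeps its label.
    """
    u, v = edges[label]
    if u == v:
        raise ValueError("cannot contract a loop")
    def rename(x):
        return u if x == v else x
    return {lab: (rename(a), rename(b))
            for lab, (a, b) in edges.items() if lab != label}

def valences(edges):
    """Number of edge ends meeting at each vertex."""
    count = {}
    for a, b in edges.values():
        count[a] = count.get(a, 0) + 1
        count[b] = count.get(b, 0) + 1
    return count

# Hypothetical example: the tetrahedral graph, a cubic graph with four faces.
K4 = {1: ("A", "B"), 2: ("A", "C"), 3: ("A", "D"),
      4: ("B", "C"), 5: ("B", "D"), 6: ("C", "D")}
G = contract_edge(K4, 1)  # 3 vertices and 5 edges remain
```

Read in reverse, the same example is an expansion move: the merged vertex of valence 4 splits into two vertices of valence 3. A Whitehead move would be a contraction followed by a different expansion of the merged vertex, which requires the cyclic order of edges around it and is therefore outside this sketch.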

### **Proposition 1**

A proof for this proposition is given in the Appendix.

Using the edge transformations listed above, we can now describe \(CombMS(n)\) as follows:

### **Proposition 2**

All migration graphs in \(CombMS(n)\) are connected by a sequence of contraction moves to a cubic graph. More exactly, for every graph \(G\) in \(CombMS(n)\), there exists a graph \(\widehat{G}\) in \(CombMS(n)_0\), a unique number \(k\) and a sequence of contraction moves \(c_1,c_2, \ldots , c_k\) such that

\(G=c_k \circ c_{k-1} \circ \ldots \circ c_{1}(\widehat{G})\)

### *Proof*

This can be proved by showing that any cubic graph embedded in a two-dimensional sphere with \(n > 4\) faces can be obtained from the circular wheel graph \(CL_n\) (Fig. 5) by Whitehead moves, where \(CL_n\) is constructed by taking two concentric copies of a regular polygon with \(m=n-2\) sides, with vertices \(P_1,P_2,\ldots , P_m\) and \(P'_1,P'_2,\ldots , P'_m\), respectively, and adding all edges of the form \(P_iP'_i\), \( i \in \{1,2,\ldots ,m\}\) (see Fig. 5 for the case of \(n=10\)).

As a consequence of this proposition, we can prove that all graphs in \(CombMS(n)\) are connected by contraction or expansion moves.

### **Proposition 3**

For every pair of graphs \((G_1,G_2)\) in \(CombMS(n)\), a sequence of contraction or expansion moves \(m_1,m_2, \ldots , m_k\) exists such that

\(G_2=m_k \circ m_{k-1} \circ \cdots \circ m_{1}(G_1)\)

Proposition 1 can be viewed as a description of \(CombMS(n)_0\), the upper layer of \(CombMS(n)\). This layer can in turn be considered a generating set for \(CombMS(n)\), in the sense that all graphs in \(CombMS(n)\) can be obtained from \(CombMS(n)_0\) by contraction moves. In Proposition 2, the graph \(\widehat{G}\) is not unique. However, the uniqueness of \(k\) allows us to define deeper layers of the structure of \(CombMS(n)\). With this in mind, we define the depth of a migration graph \(G\) as the number of contraction moves needed to obtain \(G\) from a cubic graph. Further, we define the *layer* or *stratum* \(CombMS(n)_k\) of \(CombMS(n)\) as the collection of all migration graphs at the same depth \(k\). At the deepest level of \(CombMS(n)\), there is a single graph \(C_n\) with exactly two vertices and \(n\) edges. This graph arises naturally when considering a set of \(n\) populations that lie on a spherical geodesic (with respect to the great circle metric), but as noted above, this structure also has purely mathematical utility. Figure 5 shows the example \(C_{10}\).

Note that Euler’s characteristic formula (Eq. 1) implies that every migration graph in \(CombMS(n)_0\) has \(n\) faces, \(2n-4\) vertices and \(3n-6\) edges. Each contraction move removes one vertex and one edge, and the deepest graph \(C_n\) has two vertices and \(n\) edges. Hence, \(k=2n-6\) contraction moves are needed to reach the deepest layer of \(CombMS(n)\). The number of vertices and edges at each layer of \(CombMS(n)\) can easily be computed, which implies the following proposition:

### **Proposition 4**

The propositions above provide several alternative perspectives of \(CombMS(n)\). An additional perspective is to view \(CombMS(n)\) as being generated from the graph \(C_n\), defined above, via transformations provided by expansion or contraction moves. The application of single expansion moves to \(C_n\) will produce a second generation of graphs at a higher level, which can in turn be used as seeds for a third generation of graphs. As new generations are expanded, not all graphs in \(CombMS(n)\) are necessarily obtained (see Amaris 2007, page 62). Indeed, a combination of expansion and contraction moves is required to move through the whole combinatorial space. In other words, graphs exist in \(CombMS(n)\) that cannot be obtained solely by a sequence of expansion moves from \(C_n\). This third perspective on \(CombMS(n)\), which is a consequence of Proposition 2, is summarized in the following proposition:

### **Proposition 5**

For every graph \(G\) in \(CombMS(n)\), a sequence of expansion and/or contraction moves \(m_1,m_2, \ldots , m_k\) exists such that

\(G=m_k \circ m_{k-1} \circ \cdots \circ m_{1}(C_n)\)

Jointly, the perspectives given by the propositions above can be used to obtain an explicit description of \(CombMS(n)\). For example, with just two populations, \(CombMS(2)\) comprises a single migration graph: a simple loop. In the three population case, \(CombMS(3)\) contains only one graph with two vertices and three edges. In the four population case, \(CombMS(4)\) has four possible migration graphs: two cubic graphs, one graph with three vertices, and one graph with two vertices—i.e., \(C_4\) (see Fig. 1 for a graphical representation). To explicitly enumerate graphs, it is important to remember that two equivalent graphs in \(CombMS(n)\) are considered identical.
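The vertex and edge counts of each layer can be tabulated with a short sketch. It assumes only the counts stated above for cubic graphs (\(2n-4\) vertices, \(3n-6\) edges, \(n\) faces) and the fact that each contraction move removes one vertex and one edge, so the deepest graph \(C_n\), with two vertices, sits at depth \((2n-4)-2=2n-6\). The function name is hypothetical.

```python
def layer_counts(n):
    """Rows (depth k, vertices, edges, faces) for each layer CombMS(n)_k, n >= 3."""
    rows = []
    for k in range(0, 2 * n - 6 + 1):
        vertices = 2 * n - 4 - k   # one vertex lost per contraction move
        edges = 3 * n - 6 - k      # one edge lost per contraction move
        assert vertices - edges + n == 2  # Euler's formula on the sphere
        rows.append((k, vertices, edges, n))
    return rows

four_populations = layer_counts(4)
```

For \(n=4\), the rows reproduce the enumeration above: cubic graphs with four vertices at depth 0, a graph with three vertices at depth 1, and \(C_4\) with two vertices at depth 2.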

### 2.2 Graph duality

Graph duality allows us to move to and from equivalent graph representations of the same phenomenon—in our case, a migration pattern. Each view emphasizes different aspects of the system. For instance, we can determine which population is closest to an individual just by determining the Voronoi cell in which that individual is located. By using its dual graph, we can potentially identify whether an individual came from a given population by direct migration simply by considering whether vertices representing the source and sink populations are connected by an edge, assuming the sink and source populations are known. This concept of graph duality can easily be extended to the concept of duality for the entire cell decomposition of the sphere:

### **Definition 3**

Given a graph \(G\), we denote the dual graph by \(G^{\circ }\). If \(G\) is the Voronoi diagram determined by a set of points \(P\), then \(G^{\circ }\) is the Delaunay graph determined by \(P\). Similarly, the dual of a cell decomposition of the sphere \(D\) is denoted by \(D^{\circ }\).

### **Theorem 1**

(Rivin’s theorem) Let \(\varSigma \) be a strongly regular cell decomposition of the sphere and let an angle \(\theta _e\) with \(0<\theta _e\) be given for every edge \(e\) of \(\varSigma \). Let \(\varSigma ^*\) be the dual decomposition of \(\varSigma \), and for each edge \(e\) of \(\varSigma \), denote the dual edge of \(\varSigma ^*\) by \(e^*\). A Delaunay pattern with inner circle intersection angles \(\theta _e\) exists if and only if the following two conditions hold:

1. If some edges \(e^*_1,\ldots , e^*_n\) form the boundary of a face of \(\varSigma ^*\), then $$\begin{aligned} \sum _{j=1}^{n}\theta _{e_j}=2\pi \end{aligned}$$
2. If some edges \(e^*_1,\ldots , e^*_n\) form a closed path of \(\varSigma ^*\), which is not the boundary of a face, then $$\begin{aligned} \sum _{j=1}^{n}\theta _{e_j}>2\pi \end{aligned}$$

Note that Rivin’s theorem gives necessary and sufficient conditions for the existence of a Delaunay pattern with inner circle intersection angles provided by \(\varTheta \). If a migration pattern is known, \(\varTheta \) should satisfy conditions 1 and 2 above. Conversely, if a graph is drawn on a sphere with well defined cells (‘regions’) and a positive number is chosen for each of its edges, the corresponding dual graph is a migration pattern for the chosen weights if the conditions above are satisfied. Then, as we shall see in the following section, Rivin’s theorem and the combinatorial knowledge of migration patterns allow us to systematically generate all circle patterns. We are only concerned with Voronoi cell decompositions of the sphere that are strongly regular (for more details, see Springborn 2003). The conditions on the weights \(w_e\) given in the definition of migration patterns above are independent of Rivin’s theorem. They instead arise as a mechanism to let us model migration among populations, but the specific computation of these weights and their precise interpretation must be specified in any given application. In particular, we favor commonly used indirect measures of migration (Beerli 1998), among which can be classified: (i) simple estimators based on allele frequencies (Michalakis and Excoffier 1996); (ii) maximum likelihood estimators based on allele frequencies (Rannala and Hartigan 1996); and (iii) estimators based on genealogies of the sample (Wakeley 1998). We do not know whether the different measures \(w_e\) and \(w'_e\) should be related, but this seems unlikely at the level of generality that we are considering here.
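Condition 1 of Rivin’s theorem is straightforward to check numerically. The sketch below assumes that each face of the dual decomposition is given as a list of edge labels, with dual edges sharing a label; the function name and the tetrahedral example with equal angles are hypothetical. Condition 2 requires enumerating closed paths that do not bound a face and is not covered here.

```python
import math

def satisfies_face_condition(dual_faces, theta, tol=1e-9):
    """Condition 1: the angles on the boundary of every dual face sum to 2*pi."""
    return all(
        abs(sum(theta[e] for e in edges) - 2 * math.pi) <= tol
        for edges in dual_faces.values()
    )

# Hypothetical example: a tetrahedral decomposition with edges labeled 1..6.
# Each dual face is a triangle, listed by the labels of its three edges.
DUAL_FACES = {
    "f1": [1, 2, 4],
    "f2": [1, 3, 5],
    "f3": [2, 3, 6],
    "f4": [4, 5, 6],
}
THETA = {e: 2 * math.pi / 3 for e in range(1, 7)}  # equal angles, placeholder choice

ok = satisfies_face_condition(DUAL_FACES, THETA)
```

With equal angles of \(2\pi /3\), every triangular face sums to \(2\pi \), so condition 1 holds for this example.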

### 2.3 Linear systems for metapopulation analysis

For a system of populations (a ‘metapopulation’), a specific migration pattern can be determined whenever information is known about the location of populations, as well as migration between them, at a specific time. While this information can be obtained for many real populations today, the dynamics of these populations through time is unlikely to be known. However, an analysis of migration is still possible in many instances because basic features, such as the number of populations, impose important mathematical constraints on the evolution of the system. Indeed, the combinatorial analysis of \(CombMS(n)\) has already revealed several such constraints. Moreover, Rivin’s theorem can be interpreted as a set of equalities and inequalities that, given necessary and sufficient conditions, determine a system of linear inequalities whose solutions lie on a polytope that is associated uniquely with the system under study.

Consider a migration graph \(G\)^{5} and its dual graph \(G^{\circ }\), with edges marked using the convention that pairs of dual edges have the same label. The intensity of migration between two populations joined by an edge in \(G^{\circ }\) is represented by the variable \(x_i\). Hence, we assign the linear system \(L^{\circ }(G,T)\) to \(G\) and \(T\) as:

1. \(0<x_i<1\)
2. \(\sum _{j \in E(P_i)}x_j=T_i\), for each vertex \(P_i \in V(G^{\circ })\), where \(E(P_i)\) and \(V(G^{\circ })\) are the set of edges incident to \(P_i\) and the set of vertices of \(G^{\circ }\), respectively.

*Convex*; Franz 2000).
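To give a feel for the size of the solution set of \(L^{\circ }(G,T)\), the following sketch counts the free parameters left by the equality constraints: the number of edge variables minus the rank of the vertex-edge incidence matrix of \(G^{\circ }\). This is an illustration only. The helper names and the example (four mutually adjacent populations, so that \(G^{\circ }\) is a tetrahedral graph) are assumptions, and the inequalities \(0<x_i<1\) further restrict the solutions to a bounded region.

```python
from fractions import Fraction

def incidence_rows(vertices, edges):
    """One row per vertex P_i of the dual graph: 1 where edge x_j is incident to P_i."""
    return [[Fraction(1 if v in ends else 0) for ends in edges] for v in vertices]

def rank(rows):
    """Matrix rank by Gaussian elimination over the rationals."""
    m = [r[:] for r in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Hypothetical dual graph: four populations, every pair joined by an edge.
VERTICES = [0, 1, 2, 3]
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

rows = incidence_rows(VERTICES, EDGES)
free_parameters = len(EDGES) - rank(rows)  # dimension of the affine solution space
```

In this example the six edge variables are tied by four independent equations, so the solutions of the equality constraints form a two-dimensional affine space, and the associated polytope has dimension at most two.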

Although we could associate a non-empty polytope with the genetic system above, an abstract graph that can be embedded in the sphere is not necessarily realizable in spherical geometry. Further, spherical realizability for a given abstract graph does not imply realizability of the genetic linear system associated with the same graph. Important properties of the families of polytopes associated with migration patterns will be considered in the next section, which considers the underlying combinatorics of \(CombMS(n)\). Our result links this new theory of migration patterns to polytope theory, which is a strong branch of pure and applied mathematics (see Ball 1997; Bayer and Lee 1993; Bokowski and Sturmfels 1989; Gruber and Wills 1993; Grünbaum et al. 2003; Ziegler 1994).

### 2.4 Population histories as paths on \(\mathfrak {K}\)

The migration pattern associated with a metapopulation \(M(t)=(G(t),\varTheta (t), W(t),T(t))\) is likely to change through time. However, \(M(t)\) should change relatively slowly for processes that occur over long time scales. This implies that there are periods of time when the graph \(G(t)\) is invariant. Thus, for a period of time \(I\) when \(G(t)\) is fixed, the polytope pair \(K=(K_{[1]}(t),K_{[2]}(t))\) associated with \(M(t)\) is also fixed. In this way, \(M\) is a curve in \(K(t)\)—a curve in polytope \(K_{[1]}(t)\) and a curve in polytope \(K_{[2]}(t)\) that describes changes in the geographical arrangement of populations and gene flow between them. However, since the analysis is similar for both polytopes in most cases, we will describe \(K\) as a single polytope when describing common properties.

The evolution of a set of populations can be described as a path in \(K\) for some \(t \in I\). Further, most systems will eventually undergo a change in \(G(t)\), caused by a contraction or expansion move, and hence, the polytope \(K(t)\) will also change. More generally still, a collection of polytopes appears in this process, and the structure of this collection of polytopes mirrors the combinatorial structure \(CombMS(n)\).

To describe this new structure, we first consider an arbitrary cubic graph in \(CombMS(n)_0\) with labeling \(l\), and extend its labeling to the whole stratum \(CombMS(n)_0\) while keeping all labels invariant during any Whitehead moves.^{6} Since each labeled graph has an associated polytope, a polytope complex \(\mathfrak {K}\) can be constructed that depends only on the initial labeled graph.

This polytope complex, denoted \(\mathfrak {K}\), has the following properties:

1. Each polytope of \(\mathfrak {K}\) is associated with a unique cubic graph.
2. If \(G_1\) and \(G_2\) are cubic graphs connected by a Whitehead move on edge \(e\), with associated polytopes \(K_1\) and \(K_2\), their shared subgraph \(G_{12}\)^{7} corresponds to a common face of \(K_1\) and \(K_2\). This is true because any solution for the linear system of \(G_{12}\) can be viewed as a solution for \(G_1\) and \(G_2\) with the variable corresponding to the contracted edge \(e\) set to zero.
3. If a graph \(G\) associated with a polytope in \(\mathfrak {K}\) is not cubic, its associated polytope can be viewed as a subset of a polytope associated with a cubic graph.
4. The polytope complex \(\mathfrak {K}\) depends on the original labeled cubic graph. However, as labelings are related by permutation, corresponding polytopes are related in a similar way.
5. \(\mathfrak {K}\) has a finite number of polytopes because the number of cubic graphs, as well as the number of edge labeling possibilities, is finite for a fixed number of vertices.
6. Polytopes in \(\mathfrak {K}\) may have common solutions due to the fact that different cubic graphs can give rise to the same contracted graph.
7. The labeling of a graph determines the Euclidean coordinates for a given migration pattern. However, these coordinates are not unique in the complex \(\mathfrak {K}\) because a sequence of Whitehead moves can produce several copies of a graph \(G\) with different labelings. Nevertheless, considering symmetry of the graphs under translation and rotation, we can say that those coordinates that are symmetric represent the same migration pattern for any given graph.

In our view, knowledge of the properties above is important because it provides insight into the dynamics of migration patterns and highlights some considerations that would seem necessary to develop simulation models based on this theory. For example, if we know the migration pattern of a metapopulation at some time \(t\) and the associated graph is cubic, we can deduce that the associated graph does not change over short time periods, both future and past, with respect to \(t\). We can also deduce that the graph experiences a transition only if the path of the system crosses a face of the associated polytope. An additional application of this theory allows us to simulate metapopulation dynamics by starting with construction of the complex \(\mathfrak {K}\). However, we do not need to construct all possible graphs associated with a migration pattern. The properties above instead let us compute the cubic building blocks of \(\mathfrak {K}\), which greatly simplifies this task.
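The observation that a graph transition happens only when the path reaches a face of the polytope suggests a simple check for simulated histories. The sketch below assumes that a history is sampled as a list of dictionaries mapping edge labels to their variables, and that a face is reached when one of these variables drops to zero (within a tolerance). The function name, the tolerance and the sample path are hypothetical, and the other constraints of the linear system are not verified.

```python
def first_transition(path, tol=1e-12):
    """Return (step, edge) for the first sampled point at which an edge
    variable reaches zero, or None if the path stays in the interior."""
    for step, x in enumerate(path):
        for edge, value in x.items():
            if value <= tol:
                return step, edge
    return None

# Hypothetical sampled history with two edge variables.
path = [
    {"e1": 0.30, "e2": 0.20},
    {"e1": 0.15, "e2": 0.25},
    {"e1": 0.00, "e2": 0.30},  # e1 has collapsed: a contraction move on e1
]
transition = first_transition(path)
```

In a simulation, the returned edge would identify the contraction move to apply before the path continues in a neighboring polytope of \(\mathfrak {K}\).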

Additional mathematical development on the theory of migration patterns seems unnecessary for modeling purposes, and the role of \(\mathfrak {K}\) as a feasible computable space where migration histories can be represented as paths is now clear. Only one key point remains: the flexibility to add (or remove) a population from the study system.

### 2.5 The birth of new populations

1. The Voronoi cell \(F_i\) centered at point \(P_i\) is the intersection of all half spheres \(R_{ij}\) determined by the perpendicular bisectors \(b_{ij}\) of the segments joining \(P_i\) to the other points \(P_j\). However, if \(F_i\) is surrounded by cells \(F_1,F_2, \ldots , F_m\), then \(F_i\) is the intersection of the half spheres \(R_{ik}\), where \(k \in \{1,2,\ldots , m\}\).
2. If the point \(P'_i\) is located in a circular neighborhood \(B(P_i,\epsilon )\) with center \(P_i\) and radius \(\epsilon \), and \(P'\) is obtained from \(P\) by replacing \(P_i\) by \(P'_i\), then the Voronoi cell decomposition determined by \(P'\) would be close^{8} to the Voronoi cell decomposition determined by \(P\) if \(\epsilon \) is sufficiently small.
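The description of a Voronoi cell as an intersection of half spheres translates directly into a membership test. The sketch assumes that all points are unit vectors, so that a point \(x\) is at least as close to \(P_i\) as to \(P_j\) in the great circle metric exactly when \(x \cdot (P_i - P_j) \ge 0\). The function names and example sites are hypothetical.

```python
def in_half_sphere(x, p_i, p_j):
    """True if the unit vector x lies in R_ij, the closed half sphere of
    points at least as close to p_i as to p_j."""
    return sum(a * (b - c) for a, b, c in zip(x, p_i, p_j)) >= 0.0

def in_cell(x, i, sites):
    """True if x lies in the Voronoi cell F_i, the intersection of all R_ij."""
    return all(in_half_sphere(x, sites[i], sites[j])
               for j in range(len(sites)) if j != i)

# Hypothetical sites: north pole, south pole and one equatorial point.
SITES = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 0.0, 0.0)]
x = (0.0, 0.6, 0.8)
x_in_north_cell = in_cell(x, 0, SITES)
```

Restricting the loop to the neighbors of \(F_i\), as in the second sentence of item 1, gives the same answer with fewer tests.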

## 3 Empirical example

^{9} which allows us to compute the inner angles based on the spherical law of cosines (Gellert et al. 1989). This process leads to the Delaunay cell decomposition of the sphere (Fig. 11), which in our case is a triangulation, and its associated inner angles (Table 2). To describe the migration pattern for Sumba completely, we also included values for the weights \(w_i\) (i.e., measures of gene flow), which in this instance we chose to be the \(F_{ST}\) distance between population-level mitochondrial DNA diversity (i.e., a measure of relatedness along the maternal line; Lansing et al. 2007).

Table 1 Geographical coordinates for eight Sumba populations

Population | Longitude (°E) | Latitude (°N) |
---|---|---|
Kodi | 118.960 | \(-9.583\) |
Lamboya | 119.355 | \(-9.722\) |
Loli | 119.398 | \(-9.633\) |
Wanokaka | 119.449 | \(-9.725\) |
Mamboro | 119.545 | \(-9.401\) |
Anakalang | 119.575 | \(-9.588\) |
Wunga | 119.958 | \(-9.385\) |
Rindi | 120.675 | \(-9.931\) |
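As a minimal sketch of how the spherical law of cosines enters the computation, the following code evaluates the central angle between two populations from the coordinates tabulated above. It covers only this first step (the great-circle separation between points), not the full computation of the inner angles \(\theta _i\); the function and variable names are hypothetical.

```python
import math

# (longitude, latitude) in degrees, copied from the coordinate table above.
SUMBA = {
    "Kodi": (118.960, -9.583),
    "Lamboya": (119.355, -9.722),
    "Loli": (119.398, -9.633),
    "Wanokaka": (119.449, -9.725),
    "Mamboro": (119.545, -9.401),
    "Anakalang": (119.575, -9.588),
    "Wunga": (119.958, -9.385),
    "Rindi": (120.675, -9.931),
}

def central_angle(a, b):
    """Central angle in radians between two (longitude, latitude) pairs in
    degrees, using the spherical law of cosines."""
    lon1, lat1 = (math.radians(v) for v in a)
    lon2, lat2 = (math.radians(v) for v in b)
    c = (math.sin(lat1) * math.sin(lat2)
         + math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

kodi_to_rindi = central_angle(SUMBA["Kodi"], SUMBA["Rindi"])
```

Multiplying a central angle by the radius of the sphere gives the corresponding great-circle distance. For points this close together, the law of cosines loses some numerical precision, so a production implementation might prefer the haversine form.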

Table 2 Values of the inner angles \(\theta _i\; (i=1,2,\ldots , 18)\) using \(F_{ST}\) weights

Edge label (i) | \(\theta _i\) (rad) | \(w_i\) |
---|---|---|
1 | 0.7624459 | 0.01113 |
2 | 1.943735 | 0.02481 |
3 | 1.875696 | 0.06764 |
4 | 2.187413 | 0.01297 |
5 | 1.400967 | 0.06614 |
6 | 1.189525 | 0.0461 |
7 | 1.681032 | 0.0098 |
8 | 1.013773 | 0.00572 |
9 | 0.1476047 | 0.01157 |
10 | 2.743607 | 0.02015 |
11 | 1.337191 | 0.04181 |
12 | 2.014021 | 0.06872 |
13 | 1.927833 | 0.03533 |
14 | 1.054552 | 0.04736 |
15 | 1.286779 | 0.01528 |
16 | 0.1883666 | 0.03312 |
17 | 2.101859 | 0.00891 |
18 | 0.2763413 | 0.00536 |

^{10}
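Two simple arithmetic checks can be run on the tabulated angles without knowing which edge joins which populations. The sketch below verifies that every \(\theta _i\) lies strictly between 0 and \(\pi \), that the number of edges agrees with Euler's formula for a triangulated sphere on eight vertices (\(3 \cdot 8 - 6 = 18\)), and that the angles sum to \(8\pi \) up to rounding. The last value is what one would expect if the angles on the edges meeting at each of the eight populations sum to \(2\pi \), with every edge shared by two populations; the per-vertex sums themselves are not checked here, since the incidence structure is not listed in the table.

```python
import math

# Inner angles theta_1 ... theta_18 (radians), copied from the table above.
theta = [
    0.7624459, 1.943735, 1.875696, 2.187413, 1.400967, 1.189525,
    1.681032, 1.013773, 0.1476047, 2.743607, 1.337191, 2.014021,
    1.927833, 1.054552, 1.286779, 0.1883666, 2.101859, 0.2763413,
]
n_populations = 8

# Every angle should lie strictly between 0 and pi.
all_in_range = all(0.0 < t < math.pi for t in theta)

# A triangulation of the sphere with V vertices has 3V - 6 edges.
expected_edges = 3 * n_populations - 6

# Each edge is counted at both of its endpoints, hence the factor 1/2.
expected_total = n_populations * 2.0 * math.pi / 2.0
total = sum(theta)
discrepancy = abs(total - expected_total)
```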

The question that arises in this context is how to determine the path that populations on Sumba took in the past to reach their present state. While this is a challenging problem, the theory presented here provides a natural analytical framework to address it. However, results must be interpreted in the context of each specific problem. For instance, if we consider a period of time when all migrations were restricted to Sumba, then the edges with weights \(w_{16}, w_{17}, w_{18}\) would not be considered possible migration pathways, even though they are necessary to fully describe the migration pattern of Sumba’s populations today.

## 4 Discussion and future directions

Population structure has long been known to play a central role in the dynamics of population genetic variation through time. The role of migration has been particularly emphasized by several authors (Hanski and Gilpin 1997; Slatkin 1985, 1987). Reconstructing the migration history for a set of populations is crucial for fully understanding the genetics of modern populations. As a step towards this goal, we have presented two equivalent perspectives on the movement of populations (as well as individuals between those populations). The graphs underlying these two perspectives are dual to each other, being based on the graph duality of the Voronoi and Delaunay cell decompositions of a two-dimensional sphere. The Voronoi perspective of migration illustrates history as a graph weighted by migration, which undergoes continuous deformation through time due to changes in population and individual mobility. Based on contraction and expansion moves in the static case (i.e., a fixed number of populations), splitting or merging populations is also possible. The Delaunay perspective is then represented as dynamic spheres with evolving cell decompositions driven by the addition or deletion of edges (dual to expansion/contraction moves), as well as the addition or deletion of rhomboid structures representing new or merged populations.^{11}

Representing the dynamics of population and individual mobility by paths on the polytope complex \(\mathfrak {K}\) simplifies the analysis of migration. In this setting, migration histories, represented as paths in \(\mathfrak {K}\), have several analytical advantages: (i) several migration scenarios can be compared by examining the matrices encoding their paths; (ii) different metapopulation systems that are not necessarily related geographically or temporally can be quantitatively compared; and (iii) knowledge of a particular migration pattern can constrain future or past paths in \(\mathfrak {K}\), thus reducing the search space in an inferential statistics setting.

The analytical framework presented here allows the number of populations to change as a consequence of splitting and merger events. These splits and mergers could occur sequentially or simultaneously, both of which can be explained using the same construction represented in Fig. 10 (or a straightforward generalization of this construction based on the premise that Voronoi diagrams are largely stable under minor perturbations of the Voronoi cell centers). This also implies that, under significant differences in scale (e.g., regional to global scales), parts of the Voronoi cell decomposition can simply be replaced by a point. This is still a valid representation of the migration system, although it now contains less detailed information. For instance, if a set of populations includes groups separated by thousands of kilometers as well as populations separated by only a few kilometers, then close population groups could be modeled at the global scale as a single point. This simplification could be employed to study the dynamics of large-scale systems, knowing that local population dynamics could subsequently be re-integrated into the system if required. This is an especially useful feature, as it can also be employed to handle missing data (which are common in biological datasets).
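One way to picture this coarse-graining is to merge all populations that lie within a chosen angular threshold of one another and to represent each resulting group by a single point. The sketch below is only an illustration of that idea: the single-linkage grouping rule, the use of the normalized mean vector as the representative point, the threshold, and the three example coordinates are all assumptions and are not prescribed by the framework.

```python
import math


def to_unit_vector(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to a unit vector on the sphere."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def angular_distance(u, v):
    """Great-circle distance in radians between unit vectors u and v."""
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return math.atan2(
        math.sqrt(sum(c * c for c in cross)),
        sum(a * b for a, b in zip(u, v)),
    )


def coarse_grain(points, threshold_rad):
    """Replace each single-linkage cluster of nearby points by one unit vector."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of points closer than the threshold.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if angular_distance(points[i], points[j]) <= threshold_rad:
                parent[find(i)] = find(j)

    clusters = {}
    for i, p in enumerate(points):
        clusters.setdefault(find(i), []).append(p)

    merged = []
    for members in clusters.values():
        # Representative point: the mean vector projected back onto the sphere.
        mean = [sum(p[k] for p in members) for k in range(3)]
        norm = math.sqrt(sum(c * c for c in mean))
        merged.append(tuple(c / norm for c in mean))
    return merged


# Hypothetical example: two populations 0.1 degrees apart and one far away.
fine_scale = [
    to_unit_vector(119.3, -9.7),
    to_unit_vector(119.4, -9.7),
    to_unit_vector(150.0, -9.7),
]
global_scale = coarse_grain(fine_scale, math.radians(1.0))
```

Keeping the original points alongside the merged ones makes it straightforward to restore the local detail later, in line with the re-integration described above.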

Further research on the evolution of migration patterns based on a polytope complex is also possible. One productive avenue of research will be optimizing measures of migration to capture the dynamic nature of this process. Although challenging (see Whitlock and McCauley 1999 for details), there has been substantial progress towards this end in recent years (Hey 2010; Kuhner 2006). Further, there is no obvious standard for how the total weights \(T\) assigned to the vertices of a migration graph evolve through time. Although we have previously fixed \(T\), it is equally reasonable to consider that \(T\) changes through time, thus producing a dynamic polytope complex \(\mathfrak {K}\). Hence, the migration history of a set of populations could be viewed as a path in a dynamic \(\mathfrak {K}\) with moving walls (i.e., facets of \(\mathfrak {K}\)).

Further to this idea, a simulation approach based on \(\mathfrak {K}\) could be an appropriate starting point to understand changes in population and individual mobility through time. Given known initial and final configurations (as in the ‘Out of Africa’ example used in Sect. 1), Hidden Markov Models with migration patterns as their hidden states might prove a useful way to determine the most likely migration path between two (or more) graph topologies. Frameworks such as these would be radically different from traditional gene-lineage-based simulators, such as SPLATCHE (Ray et al. 2010).

## Footnotes

- 1.
Homeomorphism indicates that there exists a continuous function of the sphere onto itself, with continuous inverse, that transforms the first graph into the second.

- 2.
We allow a graph to be a multigraph; that is, it can contain multiple edges between two vertices.

- 3.
Here, we specifically have in mind genetic measures such as \(F_{ST}\), although alternative metrics, including estimates of migration, either direct or inferred (Hey 2010), are equally well suited.

- 4.
Note that \(CombMS(n)\) does not depend on \(T\).

- 5.
Edge labeling is a one-to-one assignment of the numbers \(1,2, \ldots , m\) to the set of edges.

- 6.
At the intermediate step of a Whitehead move, a label is momentarily lost whenever an edge is deleted. We define that, in the expanding step, the new edge is given the label of the old lost edge.

- 7.
The common subgraph \(G_{12}\) is obtained in the intermediate step of the Whitehead move connecting \(G_1\) and \(G_2\).

- 8.
Close in the sense that the edges of the decomposition determined by \(P\) can be covered by circles of some small radius \(\epsilon \), such that the edges of the decomposition determined by \(P'\) are contained in the union of this cover.

- 9.
Each spherical triangle has two circumcenters, which are the antipodes of each other.

- 10.
Note that our data also fulfills the second condition of Rivin’s theorem.

- 11.
A rhomboid structure is represented by the union of blue and green edges in Fig. 10.

- 12.
We need only consider edge paths that do not have edges on the boundary of \(B\) and \(T\).

- 13.
This could be studied formally using ideas from general topology or homotopy theory.

## Notes

### Acknowledgments

The Royal Society of New Zealand supported this research via a Rutherford Fellowship (RDF-10-MAU-001) and Marsden Grant (11-MAU-007) to MPC.

## References

- Amaris AJR (2007) Weierstrass points and canonical cell decompositions of the Moduli and Teichmüller Spaces of Riemann surfaces of genus two. Ph.D. Thesis, Mathematics and Statistics, University of Melbourne. http://repository.unimelb.edu.au/10187/2259
- Ball K (1997) An elementary introduction to modern convex geometry. In: Levy S (ed) Flavors of geometry, vol 31. Cambridge University Press, Cambridge, pp 1–58
- Ban YEA, Edelsbrunner H, Rudolph J (2004) Interface surfaces for protein–protein complexes. In: Proceedings of the 8th annual international conference on research in computational molecular biology, San Diego, CA, USA, pp 205–212
- Bayer MM, Lee CW (1993) Combinatorial aspects of convex polytopes. In: Gruber PM, Wills JM (eds) Handbook of convex geometry. North-Holland, Amsterdam, pp 485–534
- Beerli P (1998) Estimation of migration rates and population sizes in geographically structured populations. In: Carvalho GR (ed) Advances in molecular ecology. IOS Press, Amsterdam, pp 39–53
- Bokowski J, Sturmfels B (1989) Computational synthetic geometry. In: Lecture notes in mathematics, vol 1355. Springer, Berlin
- Bowers PL, Stephenson K (1996) A branched Andreev–Thurston theorem for circle packings of the sphere. Proc Lond Math Soc 73:185–215
- Cox MP, Hammer MF (2010) A question of scale: human migrations writ large and small. BMC Biol 8:98
- Eils R, Bertin E, Saracoglu K, Rinke B, Schröck E, Parazza F, Usson Y, Robert-Nicoud M, Stelzer EHK, Chassery JM, Cremer T, Cremer C (1995) Application of confocal laser microscopy and three-dimensional Voronoi diagrams for volume and surface estimates of interphase chromosomes. J Microsc 177:150–161
- Flegg HG (2001) From geometry to topology. Dover, London
- Gruber PM, Wills JM (1993) Handbook of convex geometry. North-Holland, Amsterdam
- Grünbaum B, Kaibel V, Klee V, Ziegler G (2003) Convex polytopes. Springer, Berlin
- Franz M (2000) Convex: a Maple package for convex geometry. http://www-fourier.ujf-grenoble.fr/~franz/convex/
- Gellert W, Gottwald S, Hellwich M, Kästner H, Künstner H (1989) Spherical trigonometry. In: VNR concise encyclopedia of mathematics, 2nd edn. Van Nostrand Reinhold, New York, pp 261–282
- Hanski IA, Gilpin ME (1997) Metapopulation biology: ecology, genetics and evolution. Academic Press, New York
- Hey J (2010) Isolation with migration models for more than two populations. Mol Biol Evol 27(4):905–920
- Holsinger KE, Weir BS (2009) Genetics in geographically structured populations: defining, estimating and interpreting \(F_{ST}\). Nat Rev Genet 10:639–650
- Kuhner MK (2006) LAMARC 2.0: maximum likelihood and Bayesian estimation of population parameters. Bioinformatics 22(6):768–770
- Kim DS (2004) Euclidean Voronoi diagram of atoms and protein. Proceedings of the 1st biogeometry meeting. In: Conjunction with the ACM symposium on computational geometry. Technical University of New York, New York
- Lansing JS, Cox MP, Downey SS, Gabler BM, Hallmark B, Karafet TM, Norquest P, Schoenfelder JW, Sudoyo H, Watkins JC, Hammer MF (2007) Coevolution of languages and genes on the island of Sumba, eastern Indonesia. Proc Natl Acad Sci USA 104(41):16022–16026
- Lee B, Richards FM (1971) The interpretation of protein structures: estimation of static accessibility. J Mol Biol 55:379–400
- Maruvka YE, Shnerb NM, Solomon S, Gur Y, Kessler DA (2011) Slicing and dicing the genome: a statistical physics approach to population genetics. J Stat Phys 142:1302–1316
- Michalakis Y, Excoffier L (1996) A generic estimation of population subdivision using distances between alleles with special reference for microsatellite loci. Genetics 142:1061–1064
- Okabe A, Boots AB, Sugihara K, Chiu SN (2000) Spatial tessellations: concepts and applications of Voronoi diagrams. John Wiley, New York
- Rafi K, Tao J (2013) The diameter of the thick part of moduli space and simultaneous Whitehead moves. Duke Math J 162(10)
- Rannala B, Hartigan JA (1996) Estimating gene flow in island populations. Genet Res 67:147–158
- Ray N, Currat M, Foll M, Excoffier L (2010) SPLATCHE2: a spatially-explicit simulation framework for complex demography, genetic admixture and recombination. Bioinformatics 26(23):2993–2994
- Richeson DS (2008) Euler’s gem: the polyhedron formula and the birth of topology. Princeton University Press, Princeton
- Rivin I (1996) A characterization of ideal polyhedra in hyperbolic 3-space. Ann Math 143:51–70
- Slatkin M (1985) Gene flow in natural populations. Ann Rev Ecol Syst 16:393–430
- Slatkin M (1987) Gene flow and the geographical structure of natural populations. Science 236:787–792
- Springborn BA (2003) Variational principles for circle patterns. http://arxiv.org/abs/math/0312363
- Wakeley J (1998) Segregating sites in Wright’s island model. Theor Pop Biol 53:166–174
- Whitlock MC, McCauley DE (1999) Indirect measures of gene flow and migration: \(F_{ST} \ne 1/(4Nm+1)\). Heredity 82:117–125
- Zheng X, Ennis R, Richards GP, Palffy-Muhoray P (2011) A plane sweep algorithm for the Voronoi tessellation of the sphere. http://e-lc.org
- Ziegler G (1994) Lectures on Polytopes. Springer, Berlin

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.